So I was beginning to think I'd actually have to write my own CSS parser…then I found three different .jj grammar files (for JavaCC) that I should be able to bend to my will. One of them is actually used by the W3C’s CSS validator. I also found a promising HTML parser (also JavaCC-generated). It implements a nice visitor pattern and supports annotations on the nodes, so I can write one visitor that visits each node and leaves annotations indicating which CSS rules matched, then run a second visitor over the same tree that actually generates the merged file. That’s better than what I’d originally planned, with the added benefit of not requiring strict XML conformance in the markup file.
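To make the two-pass idea concrete, here's a rough sketch in plain Java. None of this is the real parser's API: `Node`, `Visitor`, `Rule`, `MatchVisitor`, and `MergeVisitor` are hypothetical stand-ins I made up for illustration, the selector matching only handles a bare tag name or a single `.class`, and I'm assuming "merged" means writing the matched declarations into an inline `style` attribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TwoPassVisitorSketch {

    // Hypothetical visitor interface; a JavaCC-generated parser would supply its own.
    interface Visitor { void visit(Node node); }

    // Hypothetical element node: tag name, optional class, children, and annotations.
    static final class Node {
        final String tag;
        final String cssClass; // null when the element has no class
        final List<Node> children = new ArrayList<>();
        final List<String> annotations = new ArrayList<>();

        Node(String tag, String cssClass, Node... kids) {
            this.tag = tag;
            this.cssClass = cssClass;
            children.addAll(Arrays.asList(kids));
        }

        void accept(Visitor v) { v.visit(this); }
    }

    // Hypothetical, heavily simplified rule: the selector is either "tag" or ".class".
    static final class Rule {
        final String selector;
        final String declarations;

        Rule(String selector, String declarations) {
            this.selector = selector;
            this.declarations = declarations;
        }

        boolean matches(Node n) {
            if (selector.startsWith(".")) {
                return selector.substring(1).equals(n.cssClass);
            }
            return selector.equals(n.tag);
        }
    }

    // Pass 1: annotate every node with the declarations of each rule that matches it.
    static final class MatchVisitor implements Visitor {
        private final List<Rule> rules;

        MatchVisitor(List<Rule> rules) { this.rules = rules; }

        public void visit(Node node) {
            for (Rule r : rules) {
                if (r.matches(node)) node.annotations.add(r.declarations);
            }
            for (Node c : node.children) c.accept(this);
        }
    }

    // Pass 2: emit markup, turning the annotations into an inline style attribute.
    static final class MergeVisitor implements Visitor {
        final StringBuilder out = new StringBuilder();

        public void visit(Node node) {
            out.append('<').append(node.tag);
            if (!node.annotations.isEmpty()) {
                out.append(" style=\"").append(String.join("; ", node.annotations)).append('"');
            }
            out.append('>');
            for (Node c : node.children) c.accept(this);
            out.append("</").append(node.tag).append('>');
        }
    }

    // Runs both passes over the same tree and returns the merged markup.
    static String merge(Node root, List<Rule> rules) {
        root.accept(new MatchVisitor(rules));
        MergeVisitor merger = new MergeVisitor();
        root.accept(merger);
        return merger.out.toString();
    }

    public static void main(String[] args) {
        Node doc = new Node("div", null, new Node("p", "note"), new Node("p", null));
        List<Rule> rules = Arrays.asList(new Rule("p", "margin: 0"), new Rule(".note", "color: red"));
        System.out.println(merge(doc, rules));
    }
}
```

Running `main` prints `<div><p style="margin: 0; color: red"></p><p style="margin: 0"></p></div>`. The nice part of splitting it this way is that the matching pass never has to know anything about output, and the output pass never has to know anything about selectors; the annotations are the only thing they share.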