sraun | Jan. 6th, 2008

I have 208 HTML files. I need to find the first occurrence of text between H1 Tags - like so:

<H1 ALIGN=CENTER>
sample text
</H1>

and then drop that text between the TITLE tags in the HEAD region. Yes, the sample text I need to grab is always on the line after the first H1 tag, and is always the only text on that line. The H1 tag is always early in the BODY region. I would love to automate this - I've got Perl, Python, and the standard Unix command-line text processing tools.

Anyone have any suggestions, magic invocations, or whatever? I know this can be done in Perl, probably fairly easily - but I don't do enough Perl to write it myself, and I can't conceptualize how to make the processing go backwards using the standard Unix tools.
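One possible approach, sketched in Python: read each file into memory in full, so the H1 text can be found first and the earlier TITLE element rewritten afterwards, with no "going backwards" needed. This is only a sketch under a few assumptions not stated above: every file already contains a TITLE element, the files can be read as Latin-1 (chosen here because it round-trips arbitrary bytes), and the names `retitle` and `retitle_file` are made up for illustration.

```python
import re

# Opening H1 tag with optional attributes, e.g. <H1 ALIGN=CENTER>
H1_OPEN = re.compile(r"<H1\b[^>]*>", re.IGNORECASE)
# First TITLE element; DOTALL lets the old title span several lines
TITLE = re.compile(r"(<TITLE[^>]*>).*?(</TITLE>)", re.IGNORECASE | re.DOTALL)


def retitle(html):
    """Return html with the TITLE contents replaced by the line after the first H1 tag."""
    lines = html.splitlines()
    heading = ""
    for i, line in enumerate(lines[:-1]):
        if H1_OPEN.search(line):
            # The wanted text is assumed to be alone on the next line
            heading = lines[i + 1].strip()
            break
    if not heading:
        return html  # no H1 (or an empty heading): leave the file unchanged
    # A function replacement avoids treating backslashes in heading as escapes
    return TITLE.sub(lambda m: m.group(1) + heading + m.group(2), html, count=1)


def retitle_file(path, encoding="latin-1"):
    """Rewrite one file in place; newline='' keeps the original line endings."""
    with open(path, encoding=encoding, newline="") as f:
        original = f.read()
    updated = retitle(original)
    if updated != original:
        with open(path, "w", encoding=encoding, newline="") as f:
            f.write(updated)
```

To run it over a directory, something like `for p in glob.glob("*.html"): retitle_file(p)` (after `import glob`) should do. Since the files are overwritten in place, it is worth trying this on a copy of the 208 files first.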

Another of Scott's Homes on the Web

ISO Text Processing Help
