HTMLTidy is one of those wonderfully efficient little tools in the *nix tradition. Like other tools of this type, it does only one thing, but does it well. It differs from most of them in one respect, though: its name clearly describes what it does, which is to tidy up messy HTML.
Have you ever worked on a project that involved editing HTML pages from a variety of sources? Ever had to work with an MS Word document saved as HTML? What about the chunderous mess that's spewed from some of those WYSIWYG web tools? Or maybe you create your pages the
Tidy fixes a number of common, and not so common, mistakes in HTML files. It does this by analyzing the markup in a file and comparing it to the HTML 4.01 specification. Depending on the options you specify, Tidy can fix the problems it finds or generate a log detailing the errors.
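The two modes described above (fix the file, or only report on it) can be sketched as a small Python wrapper around the tidy command-line program. This is a minimal sketch, not a definitive recipe: it assumes a `tidy` executable is on the PATH, and the function names `build_tidy_command` and `run_tidy` are made up for illustration. The flags used (`-quiet`, `-errors`, `-modify`) are standard Tidy command-line options.

```python
import subprocess
from typing import List


def build_tidy_command(path: str, fix: bool = False) -> List[str]:
    """Build an argument list for the `tidy` command-line program.

    fix=False: only report problems (-errors); the file is left untouched.
    fix=True:  rewrite the file in place with Tidy's corrections (-modify).
    """
    mode = "-modify" if fix else "-errors"
    # -quiet suppresses nonessential output so mostly diagnostics remain.
    return ["tidy", "-quiet", mode, path]


def run_tidy(path: str, fix: bool = False) -> subprocess.CompletedProcess:
    """Run tidy on one file (assumes `tidy` is installed and on the PATH).

    Tidy writes its diagnostics to stderr and signals the outcome through
    its exit status: 0 = no problems, 1 = warnings, 2 = errors.
    """
    return subprocess.run(
        build_tidy_command(path, fix),
        capture_output=True,
        text=True,
        check=False,  # statuses 1 and 2 are expected outcomes, not failures
    )
```

Calling `run_tidy("page.html")` would leave the file alone and return the diagnostics in the result's `stderr` attribute; passing `fix=True` would let Tidy overwrite the file, so it is worth keeping a backup first.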
The range of problems Tidy can fix is impressive. It can add missing or mismatched end tags, correct tags that are in the wrong order, insert quotes around attribute values, and even add a missing > to a tag. One of the few things Tidy can't do is add SUMMARY
Features:
* Suggests fixes and improvements for common errors found in HTML, XHTML and XML documents.
* Check multiple documents through the Batch Action Wizard.
* Ability to read settings from a default Tidy config file [new].
* Convert documents to XHTML and XML formats.
* Upgrade FONT tags to style sheets.
* Remove optional end tags.
* Indent / beautify tags, attributes and/or content.
* Change tags and/or attributes to uppercase or lowercase.
* Strip surplus tags in HTML documents generated using Word.
* Check for accessibility.
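Several of the features listed above correspond to options in a Tidy config file, which holds one `name: value` pair per line. The Python sketch below generates such a file. The option names are ones documented for classic HTML Tidy, but the particular selection, the dictionary `EXAMPLE_OPTIONS` and the helper names are assumptions made for illustration; option names can differ between Tidy versions, so check them against the version you have installed.

```python
from typing import Dict

# Hypothetical selection of options, chosen to mirror the feature list above.
EXAMPLE_OPTIONS: Dict[str, str] = {
    "output-xhtml": "yes",         # convert documents to XHTML
    "clean": "yes",                # replace FONT tags with style rules
    "indent": "auto",              # indent / beautify the markup
    "uppercase-tags": "no",        # emit tag names in lowercase
    "uppercase-attributes": "no",  # emit attribute names in lowercase
    "word-2000": "yes",            # strip surplus markup written by Word
}


def render_tidy_config(options: Dict[str, str]) -> str:
    """Render options in Tidy's 'name: value' format, one option per line."""
    return "".join(f"{name}: {value}\n" for name, value in options.items())


def write_tidy_config(path: str, options: Dict[str, str]) -> None:
    """Write the rendered options to a config file at `path`."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_tidy_config(options))
```

After `write_tidy_config("tidy.conf", EXAMPLE_OPTIONS)`, the file can be handed to the command-line tool with `tidy -config tidy.conf page.html`.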
The documentation would have to improve considerably to be atrocious, the module has some head-scratching limitations, and it is slow.
Installation on Darwin 8.5.0/perl 5.8.6 was a nightmare of dependency resolution, whether by hand or by cpan. The author might have mentioned in the instructions that the htmltidy source and headers have to be present before installation. While the documentation does say to "tell the makefile that you're using ranlib", that convoluted set of instructions doesn't actually address the problem I had.
That aside, on