GutenMark Obsolete Download Page
But it may still contain some useful info




This page contains superseded but still conceivably useful information.  Most people will be more interested in the newer download page.  The present page is no longer updated.

Contents

License
Downloading GutenMark
Installing GutenMark
Compiling GutenMark
GutenSplit
Other Stuff You Might Want

License

GutenMark is freely available under the terms of the GNU General Public License (GPL).  You may view the text of the GPL here, or you may visit the Free Software Foundation for more explanation.


Downloading GutenMark

If you have one of the directly-supported platforms, choose the appropriate "base package".  The base package contains the documentation, the binary executable, and the configuration files, but you can still benefit from downloading additional wordlists (see below).  If you don't have one of the directly-supported platforms, or if you would just like to have the source code, download the source package instead.  If you want the bleeding-edge source code (rather than the released code), choose the "development" source package.

Software

Description | Download
Current development source package (20080420) | GutenMark_source_dev-20080420.tar.gz
Experimental complete installation package for Win32 or 'x86 Linux, including GUI program GUItenMark.  If you download this, you don't need any of the stuff below!  However, the instructions on this page are not up-to-date with how to use this package.  Instead, see the instructions in the change log for 04/20/2008. | GUItenMark-demo.zip
More-current executables, if you don't feel like rebuilding the development snapshot.  Download the appropriate "base package" below, and then download the executables you need into the GutenMark-source directory. | GutenMark Linux (20040320): GutenMark; GutenMark Win32 (20040320): GutenMark.exe; GutenMark Mac OS X (20040320): GutenMark; GutenSplit Linux (20040221): GutenSplit; GutenSplit Win32 (20040221): GutenSplit.exe; GutenSplit Mac OS X (20040221): GutenSplit
20021216 Win32 base package | Zipfile
20020722 Win32 base package | Zipfile
20020714 Linux 'x86 base package | Tarball
20020714 Linux PPC base package | Tarball
20020714 FreeBSD base package | Tarball
NetBSD packages (thanks to Thomas Klausner) | http://www.netbsd.org/packages/textproc/GutenMark/README.html; http://www.netbsd.org/packages/textproc/GutenMark-words/README.html
20020714 Mac OS X base package | Tarball
20020714 source package | Tarball
Obsolete versions | FTP

The "wordlists" and "namelists" are optional files that you can download or not, as you choose.  The wordlists are categorized as highly recommended, recommended, or available, based on my own admittedly subjective experience.  Click here for an extended explanation of what wordlists do.  If you want to download several (or all) wordlists, you might prefer to use an FTP client rather than your browser.

Wordlists and Namelists

Description | Version | Download
My own special English wordlist | Jan. 7, 2003 | 1K
U.S. namelist | Nov. 10, 2001 | 348K
U.S. place names | Dec. 18, 2001 | 144K
French namelist | Nov. 11, 2001 | 7K
English wordlist | Nov. 10, 2001 | 449K
French wordlist | Nov. 17, 2001 | 373K
German wordlist  (It has been reported that this wordlist is very poor.  You might want to read the explanation before downloading it.) | Nov. 24, 2001 | 582K
Older, smaller German wordlist | Nov. 11, 2001 | 209K
Latin wordlist | Nov. 16, 2001 | 195K
Italian wordlist | Nov. 11, 2001 | 383K
Spanish wordlist | Nov. 11, 2001 | 322K
Non-U.S. place names | Dec. 22, 2001 | 5992K (Really really big!!)
Norwegian wordlist | Nov. 16, 2001 | 2078K (Really big!)
Gaelic wordlist | Nov. 11, 2001 | 298K
Danish wordlist | Nov. 11, 2001 | 558K
Swedish wordlist | Nov. 11, 2001 | 254K
Finnish wordlist | Nov. 11, 2001 | 285K
My own special non-English wordlist | Nov. 24, 2001 | 1K
browse ... | all | FTP


Installing GutenMark

... on Win32

  1. Unzip the base-package zip-file with WinZip, pkunzip, or whatever software you have that's appropriate.  This will create a directory called "GutenMark-install" containing the executable file (GutenMark.exe) and all of the documentation.
  2. Add this directory to your PATH, or copy GutenMark.exe and GutenMark.cfg to some directory that's already in your path.
  3. If you download any of the optional wordlists or namelists, put them wherever you put GutenMark.exe.  Don't uncompress the wordlists.
  4. Depending on the wordlists you've downloaded, the native languages of the etexts you're interested in, and your own personal tastes, you may want to reconfigure the software.  Important note:  Prior to version 20020721, it was necessary to edit the default configuration file so that it contained the exact pathnames of the wordlists.  In versions 20020721 and later, this inconvenient step can be omitted.
  5. You can read the documentation by looking at GutenMark-install\index.html with your web browser.

... on Linux (Intel or PPC), Mac OS X, or FreeBSD

  1. Expand the base-package tar-archive with gunzip and tar.  For example:
       gunzip GutenMark_MacOS-X_xxxxxxxx.tar.gz
       tar -xf GutenMark_MacOS-X_xxxxxxxx.tar
     This will create a directory called "GutenMark-install" containing the executable file (GutenMark) and all of the documentation.
  2. Add this directory to your PATH, or copy GutenMark and GutenMark.cfg to some directory that's already in your path.
  3. If you download any of the optional wordlists or namelists, put them wherever you put the executable (GutenMark).  Don't uncompress the wordlists.
  4. Depending on the wordlists you've downloaded, the native languages of the etexts you're interested in, and your own personal tastes, you may want to reconfigure the software.  Important note:  Prior to version 20020721, it was necessary to edit the default configuration file so that it contained the exact pathnames of the wordlists.  In versions 20020721 and later, this inconvenient step can be omitted.
  5. You can read the documentation by looking at GutenMark-install/index.html with your web browser.
  6. If you download the optional man page (not included in the tar archive), copy it wherever man pages go on your system.  You can find out where that is by using the command "man --where tail", and seeing the directory used for installation of the "tail" man-page.  (Thanks to Dave Mitchell for this tip.)  Or you can simply read the man-page in place, like so:
       man -l GutenMark.1 GutenMark
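
The first two *NIX installation steps above can be sketched as a small shell function.  This is only an illustration under stated assumptions:  install_gutenmark is a hypothetical helper (nothing of that name ships with GutenMark), and the destination directory is whatever directory on your PATH you prefer.  The directory name GutenMark-install and the file names GutenMark and GutenMark.cfg are the ones described above.

```shell
# Sketch only: unpack a GutenMark base package and copy the program and its
# configuration file into a directory that is (or will be put) on PATH.
# install_gutenmark is a hypothetical helper, not something shipped with GutenMark.
install_gutenmark() {
    archive=$1      # the downloaded base-package .tar.gz
    bindir=$2       # destination directory, assumed to be on your PATH
    workdir=$(dirname "$archive")

    # Equivalent to gunzip followed by tar -xf, but leaves the downloaded
    # archive untouched.  Extraction happens next to the archive.
    gunzip -c "$archive" | (cd "$workdir" && tar -xf -) || return 1

    # Copy the executable and its configuration file side by side.
    mkdir -p "$bindir" || return 1
    cp "$workdir/GutenMark-install/GutenMark" \
       "$workdir/GutenMark-install/GutenMark.cfg" "$bindir/"
}
```

You would call it as, say, install_gutenmark followed by the archive name and a directory such as "$HOME/bin".  Any wordlists or namelists you download would then be copied into that same directory, still compressed.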

Compiling GutenMark

If you don't have any of the platforms for which an executable program is supplied, or if you would like to modify the program, then you need to compile GutenMark yourself.  This is easy on any system that has the GNU compiler gcc and the GNU make program.  You can obtain gcc and make for free from GNU.  When compiling for Win32, the version of gcc called mingw32 (see Mumit Khan's web site) is used.

NOTE:  In versions later than 20011113, support for Borland's free C++ compiler (see Borland's web site) has been dropped, because it was just too much effort for me without knowing if anyone was interested.  If for some reason you don't want to use mingw32, and if you figure out how to get other C compilers such as Borland's or Microsoft's to work, tell me; I'll post the instructions here.

Requirements

You need to have the compression library zlib installed.  This can be obtained for free from www.zlib.org, but is already present on every *NIX system I personally have tried.  For Win32, I've included a pre-compiled zlib library with the GutenMark distribution, so you don't have to worry about it.

... on Win32

I actually build the Win32 versions of the executables from Linux.  But you can also build them from Windows as follows.

Unzip the source code, and change to the GutenMark-source directory from the DOS command line.  To compile with mingw32:
make GutenMark.exe
(This assumes that the GNU make program you got with mingw32 is the one that runs when you type "make".  If typing "make" instead calls up some other make program, such as Microsoft's or Borland's, then the software-build will not work properly.)  In addition to compiling GutenMark.exe, this will attempt to test the compilation by running GutenMark.exe to produce a sample HTML file (bldhb10.html), which it compares to an equivalent HTML file (bldhb10.txt.html) provided with the distribution.

... on *NIX

Expand the source-code archive with gunzip and tar, and (in a text console) change to the GutenMark-source directory.  To compile, simply run make (gmake in FreeBSD).

In addition to compiling GutenMark, this will attempt to test the compilation by running GutenMark to produce a sample HTML file (bldhb10.html), which it compares to an equivalent HTML file (bldhb10.txt.html) provided with the distribution.
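
If you ever want to repeat that comparison by hand, something like the following would do.  This is a sketch, not the rule the makefile actually uses:  check_build_output is a hypothetical helper, and the choice of cmp for a byte-for-byte comparison is my assumption about what counts as "equivalent".

```shell
# Hypothetical helper, not part of the GutenMark makefile: report whether a
# freshly generated HTML file is byte-for-byte identical to a reference copy.
check_build_output() {
    generated=$1
    reference=$2
    # cmp -s prints nothing and only sets the exit status.
    if cmp -s "$generated" "$reference"; then
        echo "match"
    else
        echo "differ"
        return 1
    fi
}

# Typical use from inside the GutenMark-source directory (not run here):
#   check_build_output bldhb10.html bldhb10.txt.html
```

A "differ" result does not necessarily mean the build is broken; it only means the generated file is not identical to the reference file.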

If you have a Linux version of the MinGW compiler installed, the build will actually create both Linux and Win32 versions of each executable.  In other words, you'll get a file called "GutenMark" (for Linux) and a file called "GutenMark.exe" (for Windows).  However, I'm fully aware that most people won't have a MinGW cross-compiler installed, so don't worry:  you'll still be able to build the Linux executables without difficulty.
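
Since the name of the GNU make command differs on FreeBSD, a build script that has to run on several *NIX systems might pick the name first.  The following is a minimal sketch; gnu_make_for is a hypothetical helper, and only the FreeBSD case comes from the instructions above (other systems where GNU make goes by a different name are not covered).

```shell
# Hypothetical helper: choose the GNU make command name for a given OS name,
# as printed by `uname -s`.  Only the FreeBSD case comes from the text above;
# everything else is assumed to call GNU make simply "make".
gnu_make_for() {
    case "$1" in
        FreeBSD) echo gmake ;;
        *)       echo make ;;
    esac
}

# Typical use from inside the GutenMark-source directory (not run here):
#   "$(gnu_make_for "$(uname -s)")"
```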


GutenSplit

Over the years, several people have asked me to include a command-line switch in GutenMark that produces an HTML file for each chapter rather than a single, huge HTML file.  Well, I haven't done that, but (as of 01/21/04, anyhow) you'll now find that compiling GutenMark results in a program called GutenSplit appearing in the GutenUtilities directory.  (I've also provided the GutenSplit executables as separate downloads above.)  GutenSplit is a stand-alone utility that can split the HTML file created by GutenMark into smaller HTML files. 

The file is split at the headings (which are usually the boundaries of chapters).  A table of contents HTML file is added, and hyperlinks are added to move among all the new HTML pages.  Click here to see an example of this (using the usual sample text).

The syntax for GutenSplit is as follows:
GutenSplit InputHtmlFilename OutputBasename

The OutputBasename is used by GutenSplit to name all of the smaller HTML files.  The names of the small HTML files are created by adding the suffixes 000.html, 001.html, etc. to the OutputBasename.  Suppose, for the sake of argument, that GutenMark has created a big HTML file called "TomSawyer.html".  Then, the command
GutenSplit TomSawyer.html Tom
would create a series of files called "Tom000.html", "Tom001.html", "Tom002.html", and so on.  Tom000.html happens to be the table of contents.
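
The naming rule can be illustrated with a few lines of shell.  This is only a sketch of the rule as described above, not GutenSplit's own code:  gutensplit_names is a hypothetical helper, and the number of files is supplied by hand here, whereas GutenSplit itself determines it from the headings in the input file.

```shell
# Hypothetical helper: print the file names expected for a given
# OutputBasename, i.e. the basename followed by 000.html, 001.html, etc.
# The count is passed in by the caller purely for illustration.
gutensplit_names() {
    base=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # %03d zero-pads the index to three digits.
        printf '%s%03d.html\n' "$base" "$i"
        i=$((i + 1))
    done
}

gutensplit_names Tom 3
# prints:
#   Tom000.html
#   Tom001.html
#   Tom002.html
```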

I suppose that, in theory, GutenSplit might work for HTML files other than those created by GutenMark, but I don't guarantee it.


Other Stuff You Might Want

The function of GutenMark is merely to convert the Project Gutenberg etexts to marked-up HTML or LaTeX.  If you intend to use LaTeX, I won't offer you advice because you probably know much more about the available utility software than I do.  But if you intend to use HTML, I can give you some hints.

If the HTML is all you want -- if you want to read the etext online, or to set up a web site that displays PG texts in HTML, or if you're fine with printing etexts from your browser, or if you want to use the HTML as a starting point for further markup -- then you're all set!

If, on the other hand, you don't want to use LaTeX and you are looking for an end-to-end solution that can produce attractive printable texts like this sample, then you need some better way of printing HTML than your browser can provide.  You could, of course, load the HTML into Microsoft Word or some other word processing program, and manipulate the document format manually.

The solution I would choose instead is to use a utility program that can convert HTML to Postscript printer language, or to PDF format.  Several such free utilities are available.

Description | Sample page | Version | Configuration file
8.5"×5.5" 9pt New Century Schoolbook font | page9schoolbook.pdf | Nov. 18, 2001 | half9schoolbook.rc
8.5"×5.5" 10pt Times Roman font | page10times.pdf | Nov. 17, 2001 | half10times.rc
8.5"×5.5" 10pt Bookman font | page10bookman.pdf | Nov. 17, 2001 | half10bookman.rc
8.5"×5.5" 12pt New Century Schoolbook font | page12schoolbook.pdf | Nov. 25, 2001 | half12schoolbook.rc
  • htmldoc is available either for Win32 or in source-code form (for Linux systems), and has some very nice properties.  I personally find it a little buggy, but it's apparently under active development and can presumably only get better.  The main problem is that it is very bad at right justification (or at least, I haven't figured it out), and so you need to use ragged-right text.

©2001-2005,2008 Ronald S. Burkey.  Final update 04/21/2008 by RSB.  Contact me.