=================================================
PRegEx Version 2.0 Specification and Documentation 
=================================================

What's New in 2.0?

    See section "What's New in 2.0", later in this file.

What is PRegEx?

	PRegEx is a free, cross-platform scripting Xtra for Director 11+.

	(For Director 7-10, you must download and use PRegEx version 1.0)

	It does searching, replacing, data extraction, and more.

	It provides the search features of PCRE (the Perl-Compatible
	Regular Expression library from http://pcre.org/), while adding
	its own replace capabilities.

	It also supplies Lingo versions of some powerful features of Perl
	pertaining to manipulating string data, lists, and property lists,
	and converting between all those different formats.

	You don't have to know anything about Perl to use it.

	You don't have to know about regular expressions to use it.  But
	you can do some pretty cool stuff if you do know about them.

	It uses the iconv library for full support of Unicode and all
	other character (text file) encoding formats.

Who should use it?

	If you have ever wished you could use Lingo to:

	 - do any kind of text searching
	 - modify strings
	 - parse anything
	 - extract data from a file
	 - standardize data formats
	 - clean/canonicalize/validate user-provided data fields
	 - manipulate lists and property lists
	 - copy or deep-copy lists / property lists
	 - reverse lists
	 - convert a list of one kind of thing into another kind of thing
	 - use custom sort functions to sort lists
	 - sort lists without modifying the original
	 - filter lists
	 - deal with binary data buffers in Lingo
	 - do any of the above with very large string buffers
	 - call a handler, passing arguments, and get return value
	 - have a way for a callback function to signal its caller
	 - quickly read/write entire files into/from memory
	 - globally map characters in buffers
	 - convert files between different character encodings
	 - etc.

   ... then PRegEx is for you.

Help! What is a Regular Expression?  What's going on here?

	Please see the Introduction and Examples sections near the end of
	this doc.  There are also lots of helpful tutorials on the web.

	If you have not used regular expressions before, then as you learn
	them, you will hardly believe how powerful they are.  They are
	like a whole new programming language unto themselves.  Enjoy.

What does it cost?

    Nothing.  PRegEx is a free, open-source project.

	See "PRegEx Licensing", below, for full details.


Where do I get the latest version?

	PRegEx is released on the Web site http://openxtras.org/.

	Latest updates, notes, or issues will be posted there, too.


Who made it?

	PRegEx authors are:

	  Chris Thorman <chris@thorman.com>  
	     Ravi Singh <ravi@ravware.com>

	Philip Hazel (see below) wrote PCRE, upon which PRegEx heavily
	relies, but he was not directly involved in PRegEx itself.


What other libraries is it based on?

	PCRE 7.7

	PCRE, the regular expression library that PRegEx uses, is included
	with this distribution.  It was written by:

	Philip Hazel
	University of Cambridge Computing Service,
	Copyright (c) 1997-2008 University of Cambridge

    Please see http://www.pcre.org/license.txt for more info.

	ICONV 1.12

	The iconv library enables the file-reading and -writing features.
	It is available from the Free Software Foundation, here:
	http://www.gnu.org/software/libiconv/.  It is licensed under LGPL.

	DIRECTOR XDK 11

	Of course, PRegEx also uses MOA, the Macromedia Open Architecture,
	and is built using the Director 11 XDK from Adobe at
	http://www.adobe.com/.

Who supports it?

	Nobody supports PRegEx for free.  It's free to begin with.
	However...


Can I pay for support or additional features?

	If you need support for PRegEx for a project-critical need, we
	recommend that you hire someone to support that need.

	Because the source is OPEN, you are completely free to approach
	and make an offer to anyone you like, and they are free to add
	your custom features or create any other derivative work you may
	require, subject only to the liberal licensing restrictions
	outlined in this document.

    You may especially wish to approach RavWare, one of the companies
    that helped write PRegEx.  RavWare is in the business of creating
    Xtras for others.  (See complete description up above.)
    http://ravware.com/

	Please do not be offended if the PRegEx authors or others that you
	approach are unable to assist you.  We apologize in advance if a
	lack of free or inexpensive or even available support means you
	are unable to use PRegEx for your project.

	On the other hand, we believe PRegEx is quite robust in its current
	feature set and anticipate you will have few problems making use
	of it.


Can I see some examples?  

	1) Some function descriptions include examples.

	2) See "Examples" section at end.

	3) See PRegExTestMovie.dir, which you should have received with
       this package.  It has a full test suite which can be used to
       torture-test every feature of the Xtra, including heavy leak
       testing.  There are literally hundreds of usage examples there.
       It also has a few fun little features that let you import the
       spec file you are reading now and manipulate it.

	   
How well tested is it?

	We feel that PRegExTestMovie.dir extensively tests all PRegEx
	features by calling it literally millions of times in 30 seconds
	or so, and thereby demonstrates that PRegEx is free of any leaks
	and that it performs with jaw-dropping speed.  Please try to prove
	us wrong.  We'd be grateful for bug reports.


Where do I send bug reports?

	Please send reports of confirmed or suspected bugs to:

		PRegEx Bugs <pregex-bugs@openxtras.org>

	Do not send the source code for your project.  Send the simplest
	possible 2-5-line example or set of steps, or a simple test movie
	that demonstrates the problem (without anything else in it).  Or,
	best yet, send a modified copy of PRegExTestMovie.dir with a new
	test added that demonstrates the problem.

	Be sure to state clearly in your report what you expected to
	happen, what did happen instead, and why you believe it's an error
	in the software.

    Bug reports that include a Lingo example that conclusively
    demonstrates the problem will get attention more quickly.

	Please be aware that we will be grateful for the reports, but may
	or may not have the time to reply.  


===============================================
PRegEx Licensing
===============================================

How did PRegEx get here?

	PRegEx is an "open-source" project.  


What do I get for free?

	You are free to use the accompanying version of the PRegEx Xtra in
	any way you see fit: in any project, for any purpose, at any time,
	now, or in the future, or in the past, free of charge.


Can I change the PRegEx source code?

    You may create derivative versions of the Xtra, or re-use any
    source code you find in it, but if you do so for pay or profit,
    you must provide the recipient with both the original, full, PRegEx
    package, including source code, along with any modifications you
    have made, including source code.  It would also be polite but not
    required to contribute the derived version back to the copyright
    holder via the contact information that you will find at
    http://openxtras.org/.


Is PRegEx supported or guaranteed to work?

	No! PRegEx is provided without support or warranty of any kind.  In
	particular, nobody guarantees that this code is fit for any
	purpose, or that it will not cause you and your customers great
	physical harm when you use it.  In fact, assume it will cause harm
	until you have tested it to your own satisfaction.  You accept all
	risks associated with using this software, should you choose to do
	so.
	  

Can I contribute?
	
	The best way you can contribute is to give YOUR TIME to test,
	review, use, verify, and debug this code, to make it better,
	stronger, faster, and more powerful for others.


Can I contribute financially?

    If you find that this Xtra was insanely useful, which you will,
    and then you also feel motivated to contribute $$ to help offset
    its considerable development costs and express gratitude for the
    hours and weeks of time it has saved you, or the impossible
    projects it made possible, please log on to http://openxtras.org/
    and select one of the contribution options shown there.
    Contributions will be used to help maintain the OpenXtras web site
    and anything left over will be used to feed and clothe the
    authors' families.


What about Shockwave?

	PRegEx is not currently Shockwave-safe, and the authors do not
	intend to do any work or spend any $$ to make it so.  However, you
	have the full source here.  You're free to accept the challenge --
	and the legal responsibility -- for making a Shockwave-safe
	version for whatever use you desire.  Just be sure you follow the
	guidelines laid out in this document if you distribute modified
	versions of PRegEx to anyone.


What about future versions?
	
	This liberal licensing policy may or may not apply to future
	versions of PRegEx created by Chris Thorman, the copyright holder.
	
	However, this liberal licensing policy will always apply to this
	and earlier versions and to any derivative works based on it/them.


-------------------------------------------------------------------------
Regular Expression Xtra Licensing Statement
Version 2.0
-------------------------------------------------------------------------

This is a Scripting Xtra for Macromedia Director which lets you use regular
expressions as implemented by PCRE http://pcre.org/, plus a whole lot more.

Written by:

      Chris Thorman <chris@thorman.com>
         Ravi Singh <ravi@ravware.com>

Copyright (c) 2001-2008 Chris Thorman

-----------------------------------------------------------------------------
Permission is granted to anyone to use this software for any purpose on any
computer system, and to redistribute it freely, subject to the following
restrictions:

1. This software is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

2. The origin of this software must not be misrepresented, either by
   explicit claim or by omission.

3. Altered versions must be plainly marked as such, and must not be
   misrepresented as being the original software.

4. If PRegEx is embedded in any software that is released under the GNU
   General Purpose License (GPL), then the terms of that license shall
   supersede any condition above with which it is incompatible.

5. The PCRE and iconv and Director XDK components have their own
   licensing requirements, with which you obviously should comply.

(Thanks to Philip Hazel, creator of PCRE, for the above licensing statement.)
-----------------------------------------------------------------------------

=========================================
What's New in 2.0
=========================================

Mac OS X Universal Binary
-------------------------

    PRegEx is now a Universal binary.  That means it runs natively on
    Intel-based Macs, and also on older PowerPC (PPC) Macs, without
    emulation.  However, on the Mac it is now Mac OS X only.  (In fact,
    it only supports v.10.4 ("Tiger") and later, the same as Director 11.)

Director 11+ Only
-----------------

	PRegEx 1.0 supported Director versions 7-10.  The older version
	does NOT work on Dir. 11+ (even if it seems to work on Windows).
		
Unicode
-------

	In Director 11, Adobe changed the internal string format to
	UTF-8 (Unicode).  This is great news, but completely changes the
	way that PRegEx needs to work.  Here is a summary of the changes:

	Reading/Writing files:

	Reading files into memory and writing them back out again now
	requires careful attention to text encodings.  (In Director 7-10,
	all files were simply assumed to be MacRoman or Windows1252 files,
	whether they were or not, and this was OK).  The great news is
	that PRegEx now supports essentially *all* known text file formats
	(by fully incorporating the open-source iconv library), plus some
	additional custom formats that will be helpful to PRegEx users.
	See ReadFileToString and WriteStringToFile for the details.
				
	Escape Codes:			

	PRegEx supports "interpolation" of special escape codes to
	generate special characters in strings.  Interpolation is used in
	3 places: Replace (in the replacement string), Translate (in the
	input and output mapping strings), and Interpolate.  In Director
	7-10, any 8-bit value was legal in strings.  In Director 11, all
	characters in strings must be valid UTF-8, or Director could
	crash.  So the meanings of the following escapes have changed:

		\200-\377 octal escapes - formerly inserted 8-bit char/byte, now Unicode code points 128-255
		\x80-\xFF hex escapes - formerly inserted 8-bit char/byte, now Unicode code points 128-255

	And these new escapes have been added:

		\400-\777 new octal escapes for Unicode code points 256 through 511
		\x{0}-\x{7FFFFFFF} new hex escapes for *any* valid Unicode code points

	Please note that not all Unicode code points between 0 and
	7FFFFFFF are valid!  You should restrict yourself to valid Unicode
	code points as defined in the latest Unicode specifications.  

	Also note that the UTF-8 hexadecimal representations of Unicode
	characters are NOT the same as the Unicode code point numbers.
	For example, the correct Unicode code point specification for
	"cents" sign is U+00A2, which can be specified as \x{A2} or
	\x{00A2}. The 2 hex bytes C2A2 are the UTF-8 encoding of that
	symbol, but the escape code \x{C2A2} can NOT be used to
	interpolate a cents sign into a string.  PRegEx provides no
	way to expressly indicate the UTF-8 representation of a character.
	Director and PRegEx and PCRE and iconv always figure out the UTF-8
	encodings for you.

	These escape codes are the same as PCRE's octal and hexadecimal
	escape codes, so you can use the same encodings in both the Search
	and Replace strings of any PRegEx function.
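
	As a small illustration of the code-point rule above, this sketch
	interpolates a cents sign by its code point.  It relies only on
	the PRegEx_Interpolate signature given in the Quick-Reference;
	the variable name and sample text are invented for the example.

		set price = PRegEx_Interpolate("99\x{A2}")  -- U+00A2, cents sign
		put price
		-- Do not write "\x{C2A2}" here: that names code point U+C2A2,
		-- which is a different character, not the cents sign.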

	Translate Function:

    Because of how Unicode works, the Translate function can no longer
    work with non-ASCII characters.  Specifically:

		- Any non-ASCII characters in the InputTable and OutputTable
		  will simply be ignored, as if they were not present at all.

		- If used in a "range specifier", non-ASCII characters will
		  prevent the range from being recognized as a range.

		- Any non-ASCII characters in the SrchStrL (string being
          modified) will be untouched.  I.e. they will never be
          modified by the Translate function.
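
	A minimal sketch of an ASCII-only translation follows.  It ASSUMES
	that a range is written Perl-style as "a-z"; the range syntax is
	not spelled out in this section, so check the Translate function
	description before relying on it.  The sample text is invented.

		set buf = ["crème brûlée"]
		set changed = PRegEx_Translate(buf, "a-z", "A-Z")
		put buf[1]
		-- Per the rules above, only the ASCII letters should be
		-- upper-cased; the accented characters are left untouched.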

	Quotemeta function:
		  
    Quotemeta formerly would put a backslash in front of non-ASCII
    characters.  Now, it will not.  (Those characters are always
    literal in PCRE.)
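
	A typical use of Quotemeta is to search for text literally when it
	happens to contain RE-special characters.  This is only a sketch;
	the sample strings and variable names are made up.

		set needle = PRegEx_QuoteMeta("3.50 (USD)")  -- quotes ".", "(", ")"
		set buf = ["Total: 3.50 (USD)"]
		set n = PRegEx_Search(buf, needle, "")
		if n > 0 then put "Found the literal text"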

	String lengths:

	As in Lingo, string lengths returned by PRegEx functions and
	accepted as arguments are always in terms of character length,
	never byte length.  (Prior to Director 11 and Unicode/UTF-8, these
	concepts were the same.)  For strings that are 100% ASCII, the
	lengths are the same.  For non-ASCII strings, the length in bytes
	is dependent on the UTF-8 representation.  

	The exception is when writing a string to a file: the return value
	is the size of the file on disk, in bytes, and is dependent upon the
	character encoding chosen and the content of the string, possibly
	being higher or lower than the number of characters written.

Bug Fixes	
---------

Fixed in 2.0:

	- Calling join() with an empty list crashed the Mac (and maybe Windows)

	- Could not write file names longer than 31 characters

	- The "s" option did not always function correctly

	- An error message said "...with setting" rather than "...without
      setting".
	  	  
New build methodology (for building the Xtras from source)
----------------------------------------------------------

	- Better supports source control techniques
	- Uses modern development tools (XCode, VC++ 2003 as patched)
	- No longer has to worry about Mac resource forks (new OSX binary format)
	- Uses .zip format instead of .sit for distribution
	- See ReadMe.txt files in make_mac and make_win directories for details

=========================================
PRegEx Quick-Reference / Interface Summary
=========================================

A complete detailed description of all functions follows later in this
document.  This is just a summary for quick reference.

Housekeeping functions:
-----------------------

PRegEx_Clear           ([Complete]) ==> void; partial or complete reset
PRegEx_GetPRegExVersion () ==> Version string of PRegEx (e.g. "1.0")
PRegEx_GetPCREVersion  () ==> Version string of PCRE  (e.g. "3.4")


Search/Replace low-level interface:
-----------------------------------

PRegEx_SetSearchString     (SrchStrL)   ==> True or -Err
PRegEx_SetMatchPattern     (RE, [Opts]) ==> True or -Err
PRegEx_GetNextMatch        ([noBlastBR])==> True or -Err
PRegEx_ReplaceString       (ReplPat)    ==> True or -Err


Search/Replace high-level interface:
------------------------------------

PRegEx_Search        (SrchStrL, RE, [Opts]) ==> FoundCount or -Err
PRegEx_SearchExec    (SrchStrL, RE,  Opts, #Callback, [ArgList])
PRegEx_SearchBegin   (SrchStrL, RE, [Opts]) ==> 1 (success) or -Err
PRegEx_SearchContinue() ==> 1: Found; 0: Done; Negative: -Err

PRegEx_Replace		(SrchStrL, RE, Opts, ReplPat) ==> FoundCount
PRegEx_ReplaceExec	(SrchStrL, RE, Opts, #ReplFunction, [ArgList])


Search/Extract utilities:
-------------------------

PRegEx_Split               (SrchStrL, RE, [Opts, InitList, Max])=>List
PRegEx_ExtractIntoList     (SrchStrL, RE, [Opts, InitList])=>List
PRegEx_ExtractIntoSPList   (SrchStrL, RE, [Opts, InitList])=>PList
PRegEx_ExtractIntoSPListSym(SrchStrL, RE, [Opts, InitList])=>PList


Match Status functions:
-----------------------

PRegEx_FoundCount     () ==> Running or final count of match events

PRegEx_GetPos         () ==> Char pos where last left off; next begins
PRegEx_SetPos         (num) ==> Change pos (0 <= Pos <= buffer len)

PRegEx_GetMatchBRCount() ==> Number of back refs in last matched RE

PRegEx_GetMatchString ([num]) ==> Last matched str (entire -or- BR #)
PRegEx_GetMatchStart  ([num]) ==> Start pos of ""  (entire -or- BR #)
PRegEx_GetMatchLen    ([num]) ==> Length of ""     (entire -or- BR #)
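
Sketch (not part of the interface itself): one way the iterative
search calls and the match status functions might be combined, going
only by the return values summarized above.  The sample text and
variable names are invented, and the loop ASSUMES that each successful
SearchContinue leaves the match available to the status functions.

	set buf = ["one two three"]
	if PRegEx_SearchBegin(buf, "\w+", "") > 0 then
	  repeat while PRegEx_SearchContinue() > 0
	    put PRegEx_GetMatchString() && "at" && PRegEx_GetMatchStart()
	  end repeat
	end if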


Error-handling functions:
-------------------------

PRegEx_LastErrCode		  () ==> Error code for last failed call
PRegEx_DescribeError       ([Err]) ==> Error msg (Err or LastErrCode)

PRegEx_CompiledOK          () ==> True if last expression compiled

PRegEx_MemError            () ==> True if last op failed due to memory
PRegEx_MemErrorSticky      () ==> True if any op has failed due to mem
PRegEx_MemErrorStickyReset () ==> Reset sticky err; return prev value

Preference flags:
-----------------

PRegEx_ErrorsToMessageWindow	([Bool]) ==> Echo all errors to Msg wind.


String-manipulation utility functions: 
--------------------------------------

PRegEx_QuoteMeta  (String) ==> String with RE-special chars quoted
PRegEx_Translate  (SrchStrL, InputTable, OutputTable) ==> ChangeCount
PRegEx_Interpolate(String, [VarsPList]) ==> String


List-manipulation utility functions:
------------------------------------

PRegEx_CopyList(ListOrPList, [Deep, InitList]) ==> CopiedListOrPList

PRegEx_Grep	  (List, RE, [Opts])         ==> NewList  ("PRegEx mode")
PRegEx_Grep	  (List, #Filter, [ArgList]) ==> NewList ("Filter mode")

PRegEx_Map	  (List, #MapFunction, [ArgList]) ==> MappedList

PRegEx_Sort	  (List, DeepCopy, #SortFunction, [ArgList]) ==> NewList
PRegEx_Reverse (List, [DeepCopy])	==> Reversed copy
PRegEx_Join	  (List, [DelimiterString]) ==> String

PRegEx_Keys    (PList, [InitList]) ==> KeyList
PRegEx_Values  (PList, [InitList]) ==> ValueList

PRegEx_GetSlice(List, Keys, [InitList]) ==> SliceList
PRegEx_SetSlice(List, Keys, Values) ==> List

PRegEx_PListToList       (PList, [InitList])	 ==> List
PRegEx_PListToListStrings(PList, [InitList])	 ==> List

PRegEx_ListToSPList      (List,  [InitPList]) ==> SPList
PRegEx_ListToSPListSym   (List,  [InitPList]) ==> SPList


General utility functions: 
--------------------------

PRegEx_ReadFileToString  (FilePath, TextEncoding)  ==> StringBufferList
PRegEx_WriteStringToFile (FilePath, TextEncoding, StringBufferList) ==> 1/-Err

Deprecated functions included for backward compatibility:

PRegEx_ReadEntireFile  (FilePath)  ==> StringBufferList
PRegEx_WriteEntireFile (FilePath, StringBufferList) ==> 1/-Err

Callback-related functions:
---------------------------

PRegEx_CallHandler   (#CallbackFunction, [ArgList1, ArgList2])

PRegEx_CallbackAbort([bool]) ==> Stop operation and fail with error
PRegEx_CallbackStop ([bool]) ==> Stop before this iteration, but succeed
PRegEx_CallbackLast ([bool]) ==> Stop after this iteration, but succeed
PRegEx_CallbackSkip ([bool]) ==> Skip this iteration, but continue


Error code constants:
---------------------
PRegEx_ErrCode_OutOfMemory()
PRegEx_ErrCode_SearchStrLMustBeList()
PRegEx_ErrCode_SearchStrLMustContainString()
PRegEx_ErrCode_SearchStrLLengthArgMustBeInteger()
PRegEx_ErrCode_REMustNotBeEmpty()
PRegEx_ErrCode_REDidNotCompile()
PRegEx_ErrCode_ReplPatMustBeString()
PRegEx_ErrCode_CallbackFuncMustBeSymbol()
PRegEx_ErrCode_CallbackFuncDidNotReturnString()
PRegEx_ErrCode_QuoteMetaNeedsString()
PRegEx_ErrCode_TriedToMatchWithoutSearchStrL()
PRegEx_ErrCode_TriedToMatchWithoutSearchPattern()
PRegEx_ErrCode_TriedToReplaceWithoutMatching()
PRegEx_ErrCode_CallbackRequestedAbort()
PRegEx_ErrCode_UnexpectedMOAError()
PRegEx_ErrCode_UnexpectedInternalError()
PRegEx_ErrCode_CallbackFunctionNotFound()
PRegEx_ErrCode_ExpectedListArgument()
PRegEx_ErrCode_ExpectedPListArgument()
PRegEx_ErrCode_GrepNeedsFunctionNameOrPRegEx()
PRegEx_ErrCode_ExpectedStringArgument()
PRegEx_ErrCode_SortFunctionDidNotReturnInteger()
PRegEx_ErrCode_ListIndicesMustBeIntegers()
PRegEx_ErrCode_FileNotFound()
PRegEx_ErrCode_ErrorOpeningFile()
PRegEx_ErrCode_ErrorReadingFile()
PRegEx_ErrCode_ErrorWritingFile()


Perl-ish shorter function names:
-------------------------------

	These Perl-friendlier "aliases" to certain of the PRegEx functions
	have been provided.  Their syntax is more evocative for Perl
	programmers, and others will appreciate their brevity.

	re_m			==>	PRegEx_Search	(aka "match")
	re_s			==>	PRegEx_Replace	(aka "substitute")
	re_search		==>	PRegEx_Search
	re_replace		==>	PRegEx_Replace

	re_get			==> PRegEx_GetMatchString
	re_pos			==> PRegEx_GetPos

	re_extract		==>	PRegEx_ExtractIntoList
	re_extractp		==>	PRegEx_ExtractIntoSPList
	re_extractps	==>	PRegEx_ExtractIntoSPListSym

	re_call			==>	PRegEx_CallHandler
	re_abort		==>	PRegEx_CallbackAbort
	re_stop			==>	PRegEx_CallbackStop
	re_last			==>	PRegEx_CallbackLast
	re_skip			==>	PRegEx_CallbackSkip

	re_quotemeta	==>	PRegEx_QuoteMeta
	re_tr			==>	PRegEx_Translate
	re_i			==>	PRegEx_Interpolate

	re_split		==>	PRegEx_Split
	re_join			==>	PRegEx_Join

	re_grep			==>	PRegEx_Grep
	re_map			==>	PRegEx_Map
	re_sort			==>	PRegEx_Sort
	re_reverse		==>	PRegEx_Reverse
	re_copy			==>	PRegEx_CopyList

	re_keys			==>	PRegEx_Keys
	re_values		==>	PRegEx_Values

	re_slice		==>	PRegEx_GetSlice
	re_slice_set	==>	PRegEx_SetSlice

	re_list			==>	PRegEx_PListToList
	re_list_strs	==>	PRegEx_PListToListStrings

	re_hash			==>	PRegEx_ListToSPList
	re_hash_syms	==>	PRegEx_ListToSPListSym

	re_read2		==>	PRegEx_ReadFileToString
	re_write2		==>	PRegEx_WriteStringToFile

	re_read			==>	PRegEx_ReadEntireFile  (NOTE: deprecated)
	re_write		==>	PRegEx_WriteEntireFile (NOTE: deprecated)

	re_err			==>	PRegEx_LastErrCode
	re_debug		==>	PRegEx_ErrorsToMessageWindow
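
	For instance, going by the table above, these two calls should be
	interchangeable (the buffers hold invented sample data):

		set bufA = ["cat hat cat"]
		set bufB = ["cat hat cat"]
		set n1 = PRegEx_Replace(bufA, "cat", "g", "dog")  -- long name
		set n2 = re_s(bufB, "cat", "g", "dog")            -- Perl-ish alias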


=========================================
PRegEx Return Values: General Principles
=========================================

Unless otherwise noted, search/replace related functions return an
integer saying how many matches were successfully made, even if fewer
replacements were completed due to some being skipped by the program.

For functions returning match counts, a return value of 0 means the
operation succeeded but found 0 matches (and so, of course, made 0
replacements).

Any NEGATIVE INTEGER returned by any function is an ERROR CODE, which
may be interpreted using the Error-related features of PRegEx,
described later.

Some functions return a 1 meaning successful completion or a negative
error code if an error occurred.

Consequently, you should never treat the return of PRegEx functions as
Booleans when checking whether a match was done, because Lingo
considers all non-zero numbers, even negative numbers, to be "true".

Instead, you should check integer results for being > 0 or > -1,
depending on your interest.

Wrong:   if (PRegEx_Search(str, "foo", "g")    ) then put "Found!"

Right:   if (PRegEx_Search(str, "foo", "g") > 0) then put "Found!"

Most functions that do not ordinarily return integers will either
return void or empty strings or empty lists when there is an error
encountered, and their error code is then set in the LastErrCode flag,
which may subsequently be queried.

Remember, a failure to match is never an "Error" from PRegEx's point of
view.  An "Error" always means a parameter error, syntax error, or
runtime error, such as memory or disk problems.  A failure to match is
viewed as the successful completion of a match request whose answer
happened to be "zero matches".
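
The three outcomes described above (error, no match, one or more
matches) could be told apart with a small wrapper like the sketch
below.  The handler name CountMatches is hypothetical, and the sketch
ASSUMES that PRegEx_DescribeError() called with no argument describes
the most recent error, as the Quick-Reference summary suggests.

	on CountMatches str, pattern
	  set n = PRegEx_Search([str], pattern, "g")
	  if n < 0 then
	    -- negative result: an error code, not a match count
	    put "PRegEx error:" && PRegEx_DescribeError()
	    return 0
	  end if
	  return n  -- 0 means "no matches", which is not an error
	end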


=========================================
PRegEx Parameter: General Descriptions
=========================================

In all function prototypes shown above and below, sample argument
names are used consistently to represent arguments of a particular
type or meeting certain criteria.  For example, "RE" always means a
Regular Expression string, "Opts" always means a 0-7-character string
of option flags, etc.

This section is a glossary explaining each of these standard argument
types.  Unless otherwise noted, the descriptions here apply to all
functions in which these named parameters appear.


RE -- Regular Expression pattern

Example: "(dog)|(cat)"

This is a simple Lingo string containing literal characters and/or
special character sequences that specify what is to be searched for.

See above section, and PCRE and/or Perl documentation for precise
details of the RE syntax.


Opts -- Options string

Example: "gisx"

A string of 0-7 option flag chars in any order.  

Any other type of argument is treated like an empty string ("") and
results in all options being turned off.  Any other characters in Opts
are silently ignored.

The 7 option flags are:

	 Pattern matching flags:

	  i == case Insensitive matching 

		   Corresponds to PCRE option PCRE_CASELESS 

	  s == "Single line" mode (. and \s match newline)

		   Corresponds to PCRE option PCRE_DOTALL

	  m == "Multi line" mode (^ and $ match internal line start/end)

		   Corresponds to PCRE option PCRE_MULTILINE 

	  x == eXtended mode

		   Ignores whitespace in patterns; allows comments.

		   Corresponds to PCRE option PCRE_EXTENDED

	 Behavior control flags:

	  t == sTudy; optimize the PRegEx by "Studying" it first.

	  g == Global; re-do Srch or Srch/Repl till no more match

	  e == Exec; call a callback function on each iteration
		   (see also SearchExec, ReplaceExec, SearchBegin.)
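
A short sketch of how option flags might be combined, based only on
the flag descriptions above.  The sample text is invented and no
particular counts are claimed.

	set buf = ["Dog, cat, DOG"]
	-- g + i: count every "dog", regardless of case
	set n = PRegEx_Search(buf, "dog", "gi")
	-- x: whitespace in the pattern is ignored and a comment is allowed
	set n = PRegEx_Search(buf, "d o g  # the animal", "gix")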


SrchStrL -- String to be searched ("String Buffer List")

Examples: 
		  
	["my data my data my data"]        -- string only
    ["my data my data my data", 23]    -- with optional length
    ["my data my data my data", 23, 0] -- 0 means no NUL chars

You must pass search string buffers to PRegEx in a special, arguably
unusual, way.  Instead of passing a string as you normally would when
calling a Lingo command, you pass a LIST CONTAINING A STRING upon
which searching/replacing commands can operate.

SrchStrL is a regular Lingo list.  The operation occurs on the FIRST
ELEMENT of the list.  If SrchStrL is not a list, it's a param error. A
non-string first element or an empty list is considered a parameter
error. 

The second, optional, element of the list is a length value.  If
supplied, it is taken to be the intended length (even if not the
actual length) of the first element.  Of course, this value should be
no greater than length(SrchStrL[1]) and no less than zero.

The length value is always specified as a count of *characters*.  When
non-ASCII characters are used (e.g. special, accented, or non-Roman
characters), then the length of the string in *characters* may not be
the same as the length of the string in memory, or of the length of
the string when saved in a file.

The string buffer to be searched may contain any amount of binary data,
including ascii zero (NUL), which does NOT signify end-of-string.
(However, you should be aware of bugs in Director's Message and Debug
windows which incorrectly display string buffers that have NULs in
them as if the buffers were truncated at that position.  Don't worry:
the data is still in the buffer even if it is printed out wrong.)

Supplying the length element overrides the Xtra's perceived length of
the buffer.  This allows the search or other operation to take place
on a reduced subset of the string. (Warning: doing a replace on this
string will truncate it at the specified point.  Writing a file from
this string will also truncate the resulting file.)
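
The optional length element could be used like this to confine a
search to the start of a buffer.  This is only a sketch based on the
description above; the sample text is invented and the comment states
an expectation, not tested output.

	set buf = ["my data my data my data", 7]  -- only "my data" is visible
	set n = PRegEx_Search(buf, "data", "g")
	-- If the length is honored as described, n counts matches within
	-- the first 7 characters only.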

The third, optional, element is a boolean integer (0 or 1) which says
whether the string buffer in element 1 is known to contain NUL
characters (this is set for you by ReadFileToString, for your
convenience, because you may want to use the data with
non-NUL-friendly Xtras and it will be helpful to know if it has
"binary" data that could trip them up).  Its value is never observed
by PRegEx and so does not alter the behavior of any PRegEx functions
-- all PRegEx functions are NUL-safe.  They never assume that your
data does not contain NULs.

Other elements of the list, if any, are left untouched by any
functions that modify your SrchStrL.  Normally, you would not use this
list for storage of other data.

WHY THE LIST/STRING APPROACH?  Storing the string in a list is how we
do pass-by-reference to minimize copying of the string, and also allow you
to hold the string in a single, named Lingo variable, while calling
multiple Search and/or Replace commands that will modify the string
buffer in place for you without replacing or renaming your variable.
This also allows you to pass your string buffer around from one Lingo
function to another and to PRegEx functions without copies of the
string data getting made each time you make a function call.
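
To make the point concrete, here is a sketch of a buffer being handed
to a helper handler and modified in place.  The handler names
MaskDigits and Demo are hypothetical; only PRegEx_Replace comes from
this document.

	on MaskDigits buf
	  -- buf is the caller's own list, so the caller sees the change
	  return PRegEx_Replace(buf, "\d", "g", "#")
	end

	on Demo
	  set buf = ["abc123def456"]
	  set n = MaskDigits(buf)
	  put buf[1]  -- the same variable now holds the modified string
	end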

This sample would read a tab-delimited file of settings and values:

set File = PRegEx_ReadEntireFile("@:SettingsFile.txt") -- is a SrchStrL
PRegEx_Replace(File, "(\x0D\x0A)|[\x0D\x0A]", "g", "\n") -- line ends
PRegEx_Replace(File, "\n+", "g", "\n") -- remove blank lines
PRegEx_Replace(File, "\t+", "g", "\t") -- multiple tabs --> single tab
set SettingProps = PRegEx_ExtractIntoSPList(File, "(.*?)[\t\n]", "g")


ReplPat -- Replacement string or pattern

Example: "Date: \1 Time: \3 Place: \2\n"
	
This is the replacement string for any PRegEx functions that do
replacing. It can be a simple string, OR it may also contain special
escape sequences to specify backreferences \1, \2, etc. or other
special characters.

Here is a complete list of special "escape codes" recognized within
ReplPat string:

\\            a single backslash character
\t            a single tab character (same as numtochar(9))
\n            a single newline character (aka Lingo "return" constant; aka numtochar(13); aka Macintosh newline; aka Carriage Return; aka CR)
\x##          a single UTF-8 character with 2-digit hex value, range 00 - FF (Unicode Code point number)
\x{#.......}  a single UTF-8 character with 1-8-digit hex value, range 0 - 7FFFFFFF (Unicode Code point number)
\0            a single UTF-8 NUL character (aka ASCII zero byte)
\# or \##     insert backreference by number (only recognized in replacement strings or after a match) (backslash followed by 1 or 2 digits), range 1-99
\###          a single UTF-8 character with three-digit octal value, range 000-777 (Unicode Code point number)
\(other char) insert the character itself. (e.g. \b = literal "b")
${stringkey}  string key lookup in optional caller-supplied property list (value must be a string)
${#symbolkey} symbol key lookup in optional caller-supplied property list (value must be a string)

The process of interpreting these escape sequences and converting them
into the actual output string is called "interpolation".  It is done
automatically on replacement strings, and may also be done explicitly
by calling the PRegEx function PRegEx_Interpolate().  (It is also done
in the Table arguments to Translate.)

Don't get confused: these sequences are not generally recognized by
Lingo; they are only interpreted within PRegEx search patterns (REs)
and replacement patterns (ReplPats), and by PRegEx_Interpolate().
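
The following sketch shows backreference escapes in a ReplPat and an
explicit call to PRegEx_Interpolate().  The handler name and sample
data are hypothetical.  Going by the escape table above, the buffer
should end up with the two words swapped.

```lingo
on DemoReplPat
  set Buf = ["Doe, Jane"]
  -- \1 and \2 are interpreted by PRegEx, not by Lingo.
  -- "" as Opts means a single, non-global replace.
  PRegEx_Replace(Buf, "(\w+), (\w+)", "", "\2 \1")
  put Buf[1]
  -- Explicit interpolation: \t becomes a real tab character.
  put PRegEx_Interpolate("Name:\tValue")
end
```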


InitList

For most PRegEx functions whose purpose is to create a list, an
optional InitList parameter may be specified.  If specified, then the
function will begin with that list and modify it, rather than creating
a new list for you.  Otherwise, all list-generating functions
automatically begin with a new, empty, list.

This allows you to progressively build up a list through several
invocations of PRegEx_ routines, or to use any PRegEx_ routines to
append items to an existing list.
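
A short sketch of building one list across two calls follows.  The
handler name and sample strings are hypothetical; the return value is
assigned back to the same variable so the example does not depend on
whether the list is also modified in place.

```lingo
on DemoInitList
  set Words = []
  -- First call starts from the empty list.
  set Words = PRegEx_ExtractIntoList(["red green"], "(\w+)", "g", Words)
  -- Second call appends to the list produced by the first.
  set Words = PRegEx_ExtractIntoList(["blue"], "(\w+)", "g", Words)
  put Words
end
```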


ArgList

Any function that takes a callback function also takes an optional
ArgList argument (which defaults to [], the empty list).  The values
inside the ArgList will be passed to the callback function, AFTER any
other task-centric values that must be passed.

So, for example, a #FilterFunction that must take a single argument
and return a boolean saying whether that argument should be "in" or
"out" gets passed the item to be filtered as its first argument, PLUS
any additional arguments taken from the supplied ArgList.

Additional arguments could include data to be compared against, or
perhaps other lists or property lists or instance objects that can be
used to access a database or other external resources, or to serve as
persistent state between multiple calls to the callback function.

Using ArgList is a good practice because it lets you call callback
functions by name without relying on global variables to communicate
with those functions -- pass any parameters the function needs in
order to operate in ArgList rather than using globals.
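
As an illustration, here is a sketch of a filter callback that gets
its threshold through ArgList instead of a global.  The handlers
LongerThan and DemoArgList are hypothetical names; PRegEx_Grep in
"Filter mode" is described later in this document.

```lingo
-- Hypothetical filter: keeps strings longer than MinLen characters.
-- Item comes from the list being filtered; MinLen comes from ArgList.
on LongerThan Item, MinLen
  return (length(Item) > MinLen)
end

on DemoArgList
  -- The 2 inside ArgList is passed as MinLen on every call.
  put PRegEx_Grep(["a", "abcd", "xy"], #LongerThan, [2])
end
```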


#ReplFunction -- Callback function for replacement

The SYMBOL name of a Lingo handler to be called during one of the
_Replace* commands.

The function is called EVERY time the command makes a successful match
(0 or 1 time if "global" option is off; 0 or more times if "global" is
on).

The return value, which MUST be a string, is inserted as the
replacement text.

The replacement command itself does not pass any arguments to the
function, but you may specify an optional ArgList parameter, whose
elements, if any, will be passed, each time, as arguments to
#ReplFunction.

#ReplFunction may request backrefs or the entire match string by
calling PRegEx_GetMatchString(N), and may discover which of multiple
iterations it is on by calling PRegEx_FoundCount().

Note that there is no way for the #ReplFunction to know whether it is
being called for the last time during a global replace (there is no
final "cleanup" call).

As with all callback functions in PRegEx, #ReplFunction may signal to
the function that is calling it that the function should abort, stop,
skip, or "last" -- see PRegEx_CallbackAbort, etc.

Examples of typical uses:

	- selectively replace based on calculated criteria

	- terminate a replacement early based on calculated criteria

	- look up or translate symbols from a property list or database at
	  runtime and insert them into the correct locations in a buffer.

	- extract some data before/while it is being replaced


#Callback -- General-purpose callback function

This is the symbol name of a Lingo handler in the Movie scope that
will be called, generally with arguments optionally supplied by the
calling routine, and may do anything it wishes, but should avoid
actions that would stop playback or otherwise terminate the caller's
context.


=========================================
PRegEx: Detailed Function Descriptions
=========================================

Note: common parameters are described in detail in the section above.
That information is not generally reiterated in the descriptions
below.

Housekeeping functions:
----------------------

PRegEx_Clear           ([Complete]) ==> void; partial or complete reset

	Clears internal state, search strings, back references, buffers,
	error codes, etc, except for MemErrorSticky.

	"Complete" option also clears call stack, if any, callback flags,
	and other info.  DO NOT USE "Complete" option except when first
	starting up.

	Clear is automatically called by all high-level search/replace
	functions, so you should never need to use it.

PRegEx_GetPRegExVersion () ==> Version string of PRegEx (e.g. "1.0")
PRegEx_GetPCREVersion  () ==> Version string of PCRE  (e.g. "3.4")

	As described.


Search/Replace low-level interface:
-----------------------------------

	Note: For best results, avoid using these "low-level" routines
	directly.  They are really intended only for someone who needs to
	directly control the individual steps of setting up a search
	and/or replace, or who, for efficiency reasons, would like to keep
	a single SrchStrL variable and repeatedly apply multiple REs to
	it.  The low-level routines ignore the "global" option.  They
	assume the caller wants to control multiple matches.

PRegEx_SetSearchString     (SrchStrL)   ==> True or -Err

	Sets a new string to be operated on.  Resets all counters and
	buffers and flags, except the match pattern.  Resets Pos to zero.
						  

PRegEx_SetMatchPattern     (RE, [Opts]) ==> True or -Err

	Initializes engine and then compiles new RE.  Sets Opts for
	subsequent operations.  Resets all counters and buffers and flags,
	except the search string.  Resets Pos to zero.


PRegEx_GetNextMatch        ([noBlastBR])==> True or -Err

	Performs one single search event in the current string, using the
	current pattern and options, beginning at the current Pos, either
	the Pos left from the immediate previous search (of any kind), or
	from a Pos you determine by first using SetPos().
						  
	When GetNextMatch succeeds, any previous global back-reference
	data is replaced by the new back-reference data (see "Match Status
	Functions" below).

	When it fails, all back-reference buffers are cleared out and
	MatchStatus functions will all return zero/empty/void.

	The optional noBlastBR argument tells GetNextMatch to not blow
	away the back-reference buffers when it FAILS, but instead, to
	keep the information there from the previous successful match.

	Important special case: If Entire Match is zero-length (i.e. a
	match succeeded but matched string had no length), then Pos will
	be increased before the next iteration; this guarantees that a
	global match will terminate by stepping through the string
	character-by-character rather than spinning endlessly at the
	starting position.  This behavior applies to all matching
	functions in PRegEx.


PRegEx_ReplaceString       (ReplPat)    ==> True or -Err

	ONLY AFTER a successful match, replaces the entire matched segment
	with ReplPat, after "interpolations" have been performed
	(i.e. inserting back references or other special escape sequences
	into a copy of ReplPat before then inserting the resulting string
	into the search buffer). 

	Note that all Replace functions in PRegEx MODIFY the original
	buffer.  They never return a copy.
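
	For readers who do need the low-level routines, here is a rough
	sketch of a manual search-and-replace loop.  The handler name and
	data are hypothetical, and it ASSUMES that GetNextMatch returns a
	positive value only when a match was found, which the summary
	above does not state explicitly.

```lingo
on DemoLowLevel
  set Buf = ["a1 b2 c3"]
  -- Same order as the high-level routines: pattern, then string.
  if PRegEx_SetMatchPattern("\d") < 0 then return
  if PRegEx_SetSearchString(Buf) < 0 then return
  -- Assumption: a positive result means "matched".
  repeat while PRegEx_GetNextMatch() > 0
    -- ReplaceString edits Buf in place after a successful match.
    if PRegEx_ReplaceString("X") < 0 then exit repeat
  end repeat
  put Buf[1]
end
```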


Search/Replace high-level interface:
------------------------------------

	You should almost always choose to use these "high-level"
	functions and avoid the "low-level" interface whenever possible.

	Only the high-level functions are aware of the "g" (global) flag.

	These "high-level" search/replace functions, and any other
	functions that use SrchStrL, RE, or Opts arguments, always
	internally call the low-level functions listed above, or their
	equivalents, as needed to perform their documented tasks.

	Their function is abstractly described here partially in terms of
	the low-level functions above; and these routines have the same
	effect as if they were implemented by actually calling the
	low-level routines.

	However, in actual fact, they may or may not be implemented
	exactly that way; for example, doing a global replace is
	implemented more efficiently by doing all the searching in one
	shot and then all of the replacing, rather than by repeatedly
	calling GetNextMatch and ReplaceString.

	Consequently, do not rely on any particular assumptions about the
	contents of a string buffer DURING the course of operation of a
	single high-level Replace (say, for example, inside a callback
	function being called in the middle of a global Replace).


PRegEx_Search        (SrchStrL, RE, [Opts]) ==> FoundCount or -Err

	Sets up and does a search, comparing SrchStrL to RE.

	If Global, the search is repeated continuously until it cannot
	match anymore.

	Afterwards, the Match Status functions only return information
	pertaining to the LAST successful search done.  If there were zero
	matches, then the Match Status information will all be empty/void.

	Returns the FoundCount or Err code.  In non-global mode, this will
	be 0 or 1 (or a negative Err code on failure).  In global mode it
	will be 0 or higher and can be treated as a count of the number of
	entire matches.

	If "e" (exec) option is supplied, then Search behaves exactly like
	SearchExec, documented below.

	Equivalent to:

	- Call PRegEx_SetMatchPattern; or fail if error

	- Call PRegEx_SetSearchString; or fail if error

	- Call PRegEx_GetNextMatch 1 time, or until the search fails if
	  global; return Err if error.  (Retain back refs from the last
	  successful search when in global mode.)

	- return PRegEx_FoundCount()

PRegEx_SearchExec    (SrchStrL, RE,  Opts, #Callback, [ArgList])

	Like PRegEx_Search, but takes a #Callback function, which is
	called, with arguments from optional ArgList, after each
	SUCCESSFUL match that takes place.  Callback may use any of the
	Match Status functions to inquire about the current match.
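
	A minimal sketch of SearchExec with a callback that collects
	matches into a list supplied through ArgList.  The handler names
	are hypothetical.

```lingo
-- Hypothetical callback: Found is the list passed in via ArgList.
on CollectWord Found
  -- Lists are passed by reference, so the caller sees the additions.
  add(Found, PRegEx_GetMatchString(1))
end

on DemoSearchExec
  set Found = []
  PRegEx_SearchExec(["one two three"], "(\w+)", "g", #CollectWord, [Found])
  put Found
end
```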


PRegEx_SearchBegin   (SrchStrL, RE, [Opts]) ==> 1 (success) or -Err
PRegEx_SearchContinue() ==> 1: Found; 0: Done; Negative: -Err

	These two functions are used as a pair if you want to execute some
	Lingo code in-line, each time a successful match takes place, like
	this:

		  if (PRegEx_SearchBegin(str, "(\w+)", "g") > 0) then
		    repeat while (PRegEx_SearchContinue() > 0)
		      put PRegEx_GetMatchString(1)
		      if PRegEx_FoundCount() > 3 then exit repeat
		    end repeat
		  end if

PRegEx_Replace		(SrchStrL, RE, Opts, ReplPat) ==> FoundCount

	Sets up and performs a single or global search and replace in
	SrchStrL using RE and Opts.  ReplPat is interpolated and inserted
	on each successful match.

	If "e" (exec) option is supplied, then Replace behaves exactly
	like ReplaceExec, documented below (ReplPat is replaced by an
	executable #ReplFunction, with optional argument list).

PRegEx_ReplaceExec	(SrchStrL, RE, Opts, #ReplFunction, [ArgList])

	Like Replace, but instead of using a fixed ReplPat string, calls
	#ReplFunction, optionally supplying any arguments from ArgList.
	(Note: Replace does NOT supply any information about the match
	directly to #ReplFunction.  #ReplFunction should use any of the
	Match Status routines for that information, if needed.)

	#ReplFunction is REQUIRED to return a string each time it is
	called.  Failure to do so causes immediate termination of
	ReplaceExec, with an error code being returned.

	The string returned by #ReplFunction is used as the replacement
	for the entire matched string.  Returning the empty string, then,
	causes the matched string to be deleted from the string buffer.
	Returning PRegEx_GetMatchString(0) causes the original string to
	replace itself, essentially skipping this replacement.

	The string returned by #ReplFunction is not subject to
	interpolation, but rather inserted literally into the buffer. So
	don't try to return "Joe \1 Blow" and expect \1 to convert into
	back-reference.  But you could call "Interpolate" specifically:
	return(PRegEx_Interpolate("Joe \1 Blow")).

	The #ReplFunction may and should use the Callback-related
	Abort/Stop/Skip/Last flags, described later, in order to signal
	ReplaceExec to alter its default looping behavior.
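
	The "return a string every time" and "return the match to skip"
	rules can be sketched like this.  The handler names and the Limit
	parameter are hypothetical.

```lingo
-- Hypothetical callback: doubles numbers up to Limit, skips the rest.
on DoubleSmall Limit
  set Text = PRegEx_GetMatchString(0)
  set N = value(Text)
  if N > Limit then return Text   -- re-insert the match: no change
  return string(N * 2)            -- always return a string
end

on DemoReplaceExec
  set Buf = ["3 apples, 40 pears"]
  PRegEx_ReplaceExec(Buf, "\d+", "g", #DoubleSmall, [10])
  put Buf[1]
end
```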


Search/Extract utilities:
-------------------------

	Searching with parentheses and then checking back-references is
	the standard way to retrieve searched/matched data from a string
	buffer.  The Search and Replace functions, combined with the Match
	Status functions, make it easy to extract values one at a time or
	in small clusters.

	The Search/Extract utilities, on the other hand, provide
	convenient ways to extract an arbitrary number of data values from
	a string buffer in one or a few quick operations.  Please study
	the purpose of these functions since they're almost always more
	convenient than the simple search functions:


PRegEx_Split               (SrchStrL, RE, [Opts, InitList, MaxItems])=>List


	"Splits" a string buffer, using the pattern specified in RE as a
	delimiter.  The matched portions of the string are REMOVED, and
	the intervening segments are extracted into a list.
	
	However, if the RE contains backreferences, then ALL of the
	backreferences generated by the RE, in numeric order, will be
	inserted, each as a separate element, into the resulting list at
	the appropriate point in the list.  This allows retention of all
	the matched portions of the original string, as well.

	Here's another way to think about Split: it's the same as
	PRegEx_ExtractIntoList, but in addition to extracting the
	backreferences from each match, also adds all of the strings
	BETWEEN each matched segment, effectively "split"ting the string
	into multiple strings.

	Optional MaxItems argument, which must be 2 or greater to be
	meaningful, limits the maximum number of items that the list will
	be split into.  (i.e. limits the max number of successful matches
	to (MaxItems - 1)).  Omitting the optional Opts argument or
	omitting the "g" flag from Opts has the same effect as setting
	MaxItems = 2 because only one match will be performed and the
	string will be split into two parts.

	If MaxItems is zero or unspecified, Split will remove any empty
	trailing items that would result if the delimiter RE is found to
	match at the very end of the search string.  In other words,
	splitting "1,2," on comma would yield ["1", "2"].  However, if
	MaxItems is ANY NEGATIVE NUMBER, then empty trailing items will
	not be removed and the result would be ["1", "2", ""].  Note: in
	order to be able to pass MaxItems, you'll be forced to also pass
	values for Opts and InitList.  These can be defaulted to "" and
	[], respectively.

	Examples:

	put PRegEx_Split(["1 2 3"], "\s+", "g") -- splitting whitespace
	-- ["1", "2", "3"]

	put PRegEx_Split(["1 2 3"], "\s+", "g", [], 2) -- max 2 items
	-- ["1", "2 3"]

	put PRegEx_Split(["1 2 3"], "(\s+)", "g") -- keeping whitespace
	-- ["1", " ", "2", " ", "3"]

	put PRegEx_Split(["1 2 3"], "(\w+)", "g", [],  0) -- delim @ start,end
	-- ["", "1", " ", "2", " ", "3"] -- note "" at start, but not end

	put PRegEx_Split(["1 2 3"], "(\w+)", "g", [], -1) -- note MaxItems = -1
	-- ["", "1", " ", "2", " ", "3", ""] -- note "" at start, AND at end


PRegEx_ExtractIntoList     (SrchStrL, RE, [Opts, InitList])=>List

	Does a global or non-global search, putting ALL MATCHED BACK
	REFERENCES (omitting non-matched ones, but keeping empty matches)
	from each iteration into a Lingo list; if global, repeats until
	matching fails, gathering up all the back references from all
	iterations along the way.

	Equivalent to the following:

	- Start with InitList or create an empty list to hold elements.

	- Enter a Begin/Continue loop; if errors, return empty list.

	- On each iteration, call PRegEx_GetMatchBRCount to count backrefs

	- For each back reference:
	   - Call PRegEx_GetMatchString
	   - If error, abandon partial list & return an empty list.
	   - Insert string into list

	- Return list.
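
	A short usage sketch follows; the handler name and sample data
	are hypothetical.  With two parenthesis pairs in the RE, each
	successful match should contribute two strings to the list.

```lingo
on DemoExtract
  set Buf = ["x=1, y=22"]
  -- Two back references per match: the name and the number.
  set Pairs = PRegEx_ExtractIntoList(Buf, "(\w+)=(\d+)", "g")
  put Pairs
  -- An empty list can also mean an error, so check the code too.
  put PRegEx_LastErrCode()   -- 0 is documented as success
end
```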


PRegEx_ExtractIntoSPList   (SrchStrL, RE, [Opts, InitList])=>PList
PRegEx_ExtractIntoSPListSym(SrchStrL, RE, [Opts, InitList])=>PList

	These Extract routines are the same as PRegEx_ExtractIntoList, but
	using a sorted property list; strings extracted using the current
	set of matched backreferences are inserted pairwise into the list.

	Here is how it works... as each complete pair is retrieved:

	- Use first item in pair as the key, second item as the value.

	- Add/Replace an entry into the SPList

	- If odd number of items, then use <void> as final value.

	The properties generated by ExtractIntoSPList are "String"
	properties, which ARE allowed in Lingo, and can be absolutely any
	string.

	ExtractIntoSPListSym is identical except that it converts all
	property strings to symbols before inserting them into the list.
	Consequently, it is imperative to ensure that all strings destined
	to become properties can actually be converted into legal Lingo
	symbols.  

	(Lingo places many restrictions on what characters may legally
	appear in property names (aka symbols).  It is your responsibility
	to ensure the input is going to be clean, or some funky, broken,
	or illegal symbols could result.)

	Examples:

	put PRegEx_ExtractIntoSPList   (["c d b a"], "(\w+)", "g")
	-- ["b": "a", "c": "d"]

	put PRegEx_ExtractIntoSPListSym(["c d b a"], "(\w+)", "g")
	-- [#b: "a", #c: "d"]


Match Status functions:
-----------------------

	These functions return information about the last successful match
	AND any backreference substrings that are available due to the use
	of parentheses inside the RE.


PRegEx_FoundCount     () ==> Running or final count of match events

	This returns the number of matches completed by a previous search
	event, or done up to this point in an ongoing search.

	Always re-set to 0 at the start of any match-related function
	except GetNextMatch itself.  Incremented by 1 each time a match
	happens, and always before any callback routines, so callback
	routines may call this to find out the iteration count of a global
	search IN PROGRESS.
	
	Note: this function does not count backreference matches.  It
	counts each entire successful match as one event, regardless of
	the number of successful backreference matches each might have had
	within it.


PRegEx_GetPos         () ==> Char pos where last left off; next begins
PRegEx_SetPos         (num) ==> Change pos (0 <= Pos <= buffer len)

    "Pos" is the character offset within the currently-active SrchStrL
    of where the current or most recent successful match STOPPed
    (which is also the beginning point for the next attempted match,
    unless the string buffer or RE is replaced).

	GetPos returns this value.

	SetPos lets you set the Pos for the following GetNextMatch either
	ahead or backward.  SetPos(0) would always restart from the
	beginning.  The legal bounds of Pos are 0 <= Pos <=
	length(SrchStrL[1]).

	Generally, it is recommended that you avoid calling SetPos during
	the middle of any of the high-level Search/Replace routines,
	especially the Replace routines, or unpredictable results could
	occur.  Instead, call SetPos() only when working with the
	low-level interface routines.

	High-level routines always re-set Pos to zero before they start,
	because they internally call the low-level routines
	SetMatchPattern and SetSearchString, which have this effect as
	well.

	Recommendation: instead of ever using GetPos or SetPos, use the
	power of REs to extract the data you need based on its pattern and
	nearby context, rather than trying to search at specific character
	positions within a buffer.


PRegEx_GetMatchBRCount() ==> Number of back refs in last matched RE

	Returns the number of backreference-generating parenthesis pairs
	that were in the currently-successfully-matched RE.

	This number serves as the upper bound of the "num" argument to the
	following routines -- i.e. it gives the number of the
	highest-available numbered back reference from the current match.


PRegEx_GetMatchString ([num]) ==> Last matched str (entire -or- BR #)
PRegEx_GetMatchStart  ([num]) ==> Start pos of ""  (entire -or- BR #)
PRegEx_GetMatchLen    ([num]) ==> Length of ""     (entire -or- BR #)

	These return the entire string, its start position within the
	original buffer, and its length, for the Entire Match, or, if num
	is supplied and > 0, for any numbered backreference string.

	If GetMatchString and GetMatchLen return "" and 0, respectively,
	it means the corresponding match string was a successful match,
	but empty, and GetMatchStart will still give the correct offset of
	that matched position.

	If they return void, it means that there is no corresponding
	successful match, and GetMatchStart will also return void.

	For example:

	put PRegEx_Search(["Ravi is a nice guy"], "((Chris)|(Ravi))")
	-- 1

	put PRegEx_GetMatchString(0)
	-- "Ravi"

	put PRegEx_GetMatchString(1)
	-- "Ravi"

	put PRegEx_GetMatchString(2)
	-- <Void>        	-- 2nd set of parens did not kick in

	put PRegEx_GetMatchString(3)
	-- "Ravi"

	You can use this to check which of several alternate cases in
	a match pattern was the successful one:
			
	if not voidP(PRegEx_GetMatchString(2)) then put "Chris matched."
	if not voidP(PRegEx_GetMatchString(3)) then put "Ravi matched."
	-- "Ravi matched."


Error-handling functions:
-------------------------

PRegEx_LastErrCode		  () ==> Error code for last failed call

	Yields the numeric error code generated by the immediate previous
	PRegEx function call.

	0 means success.  All other codes are negative values.

	Some functions return their error codes, and LastErrCode() will
	agree with those; others do not return integers, and so checking
	LastErrCode() is the only way to check the exact error in case
	they return an unexpected result.

PRegEx_DescribeError       ([Err]) ==> Error msg (Err or LastErrCode)

	Given an Error code, returns a string message explaining it.
	
	If no Err is supplied, then describes PRegEx_LastErrCode().

	Returns empty string if the Error code is zero (success).

	Example:

	put PRegEx_DescribeError(PRegEx_ErrCode_SearchStrLMustBeList())
	-- "PRegEx: SearchStrL argument must be a Lingo list."


PRegEx_CompiledOK          () ==> True if last expression compiled

   Returns true if and only if the last attempted compilation of a
   regular expression succeeded, even if there have been other
   intervening errors since then.
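
   As a sketch of how these calls fit together, the handler below
   (a hypothetical name) deliberately passes a malformed RE with an
   unclosed parenthesis, then inspects the result.

```lingo
on DemoErrors
  -- "(abc" has an unclosed parenthesis and should fail to compile.
  set Result = PRegEx_Search(["abc"], "(abc")
  if Result < 0 then
    put PRegEx_DescribeError(Result)   -- message for the returned code
    put PRegEx_CompiledOK()            -- expected to be false here
  end if
end
```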


PRegEx_MemError            () ==> True if last op failed due to memory

   Returns true if the last PRegEx function generated a memory error.
   
   Each new PRegEx function call resets this value.


PRegEx_MemErrorSticky      () ==> True if any op has failed due to mem
PRegEx_MemErrorStickyReset () ==> Reset sticky err; return prev value

   MemErrorSticky() returns true if ANY PRegEx function has generated a
   memory error at any point since the last call to PRegEx_Clear(1)
   ("Complete" reset), or since the last call to
   PRegEx_MemErrorStickyReset(), which turns off this flag until the
   next memory error occurs.
   
   This flag could be checked after a long sequence of PRegEx calls to
   see if there was a problem encountered.  Or, it could be checked
   every time through an idle loop, perhaps.
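
   The "check after a long sequence" pattern might look like this
   sketch.  The handler name and alert text are hypothetical; the
   file path is the one used in the earlier settings-file sample.

```lingo
on DemoStickyCheck
  PRegEx_MemErrorStickyReset()   -- start from a clean flag
  set Buf = PRegEx_ReadEntireFile("@:SettingsFile.txt")
  PRegEx_Replace(Buf, "\t+", "g", "\t")
  PRegEx_Replace(Buf, "\n+", "g", "\n")
  -- One check covers every call made since the reset.
  if PRegEx_MemErrorSticky() then
    alert "PRegEx reported a memory error during processing."
  end if
end
```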


Preference flags:
-----------------

	Functions listed in this section act as both the Get() and
	Set(1/0) functions for the correspondingly-named preferences.
	(Call with no arguments to Get() the value, and call with 1
	argument to Set the value, which is also returned to you.)


PRegEx_ErrorsToMessageWindow ([Bool]) ==> Echo all errors to Msg wind.

	Tells PRegEx to echo the string description of any error codes
	generated by any PRegEx routine directly to the message window
	immediately as they occur.  This can be left on all the time, if
	desired: it has no effect during projector playback, because
	projectors lack a message window.


String-manipulation utility functions: 
--------------------------------------

PRegEx_QuoteMeta (String) ==> String with RE-special chars quoted

	Takes a Lingo string and returns a copy of the string with any
	potentially special "meta" characters "quoted" ("escaped") by
	having a backslash inserted in front of them.  This makes the
	string "safe" to use in an RE, even when its contents or origin
	cannot be known or trusted in advance (e.g. searching for
	user-supplied data with a potentially untrusted user, or any time
	when you know you want to search literally for a string that might
	have special characters in it and you may or may not know that in
	advance.  Maybe you want to search for "?" or backslash, for
	example).

	The characters that get escaped are EVERY CHARACTER EXCEPT a-z,
	A-Z, 0-9, and underscore, and non-ASCII characters.

	As a special case, NUL characters in the input are escaped as
	"\0", so the output of QuoteMeta is 100% compatible with the
	ReplPat argument to the Replace functions.

	In other words, the QuoteMeta function is equivalent to this Lingo
	example (except it does NOT have the side effect of modifying the
	current search string, pattern, or Match Strings etc. as calling
	PRegEx_Replace would do):

	on QuoteMeta String
	  set myStr = [String]
	  PRegEx_Replace(myStr, "([^A-Z_0-9\x{7F}-\x{7FFFFFFF}])", "gi", "\\\1")
	  PRegEx_Replace(myStr, "\0",           "g",  "\\0" )
	  return myStr[1]
	end QuoteMeta

	Note: PRegEx_Interpolate can be used to reverse the processing done
	by QuoteMeta.
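
	A typical use is searching for user-supplied text literally, as
	in this sketch.  The handler names and sample strings are
	hypothetical.

```lingo
-- Hypothetical helper: counts literal occurrences of UserText in Buf.
on CountLiteral Buf, UserText
  -- Escape RE-special characters such as "+" before searching.
  set SafeRE = PRegEx_QuoteMeta(UserText)
  return PRegEx_Search(Buf, SafeRE, "g")
end

on DemoQuoteMeta
  -- Without QuoteMeta, "1+1" would mean "one or more 1s, then 1".
  put CountLiteral(["1+1=2 and 1+1=2"], "1+1")
end
```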


PRegEx_Translate(SrchStrL, InputTable, OutputTable)

	Converts chars in SrchStrL using the mapping specified.

	InputTable and OutputTable are a pair of strings specifying
	input-chars and corresponding output-chars; any input-char
	mentioned in SrchStrL will be mapped to the corresponding
	output-char.  Others will be untouched.

	Dashes can be used in InputTable and OutputTable to signify a
	range of characters.

	Example: 

	PRegEx_Translate(SrchStrL, "a-z", "n-za-m") -- Rot13 encode/decode

	Supports interpolation of \t, \n, \0, \\, \xDD for hex, \123 for
	octal in the InputTable and the OutputTable.  But, does NOT
	support back-reference interpolation as that would almost never be
	helpful.  \# and \## are ignored, consequently, except for \0.
	Does NOT support variable and symbol interpolation syntax.
	"Translate" has its own, different, syntax.  Non-ASCII characters
	will be interpolated but then ignored (see note below).

	InputTable and OutputTable may contain ascii-zero (NUL)
	characters.

	If you want to mention a literal dash in either the InputTable or
	OutputTable, that character must either be the first or last
	character in the table, where it couldn't possibly be interpreted
	as a range specifier.

	If for any reason there are fewer characters in the Output table
	than in the Input table, then the last character is understood to
	be replicated as necessary.

	Examples: 

	PRegEx_Translate(SrchStrL, "-.", "M") -- dash or dot become M

	PRegEx_Translate(SrchStrL, "\000-\177", "\177-\000") -- invert all ASCII chars

	PRegEx_Translate(SrchStrL, "a-zA-Z", "_") -- all alpha chars become underscores
	
	Returns number of characters that changed; 0 if none did; or a
	negative error code if there is an error in the parameters.

	Translate ONLY works with ASCII characters (also known as Unicode
	code points zero through 127 also known as "7-bit" characters).
	This means: Only ASCII characters are recognized in Input and
	Output tables -- non-ASCII characters are ignored, but will
	disrupt interpretation of "range" specifiers.  Non-ASCII
	characters in SrchStrL (the string being altered) will always be
	completely ignored.  (Yes, this is a step *backward* from PRegEx
	1.0 functionality, but it is an unavoidable consequence of using
	Unicode rather than a fixed 8-bit character set.)  If you want to
	do substitutions on non-ASCII characters, please use the regular
	search/replace features.
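
	One more sketch, using only the range syntax described above, to
	upper-case the ASCII letters in a buffer.  The handler name and
	sample text are hypothetical.

```lingo
on DemoTranslate
  set Buf = ["hello, world"]
  -- Map each lowercase ASCII letter to its uppercase counterpart.
  set Changed = PRegEx_Translate(Buf, "a-z", "A-Z")
  put Buf[1]     -- the buffer is modified in place
  put Changed    -- number of characters changed, or a negative error
end
```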
	
PRegEx_Interpolate(String, [VarsPList]) ==> String

	Does the pre-processing step that PRegEx_ReplaceString would do
    before it does a replace, and returns the interpolated string.

	Note: Since interpolation is usually done on short-ish
	programmer-supplied strings rather than large buffers, the
	incoming argument is a simple string, not a String Buffer (list).

    Supports all of the escape codes listed in the "ReplPat" section,
    including insertion of back-references, if any (see "escape codes"
    above for details).

	IN ADDITION to the normal interpolation, and IF the optional
    argument of VarsPList is supplied, then the sequence ${Foobar}
    inside the String will be replaced with the value of the property
    (string) "Foobar" from VarsPList, and ${#Foobar} will be replaced
    with value of the property (symbol) #Foobar.

	Properties whose values are absent or not of type "string" will
    result in an empty string being inserted.

	Example:

    set Props     = [#FirstName: "Joe"]
    set Location  = "Town: Davis County: Sacramento"

    PRegEx_Search([Location], "Town: (.*?) County:") -- sets \1 
    put PRegEx_Interpolate("\1 says \x22Welcome, ${#FirstName}!\x22", Props)
	 -- "Davis says "Welcome, Joe!""

	Note: Although not documented to behave this way, in the current
	MOA implementation, searching a property list for the property "a"
	is considered equivalent to searching for the property #a, and
	vice versa.  Consequently, Interpolate also has this behavior --
	i.e. it does not distinguish between the string and symbol forms
	of the property name.  However, if MOA ever "corrects" this
	behavior, then Interpolate will behave with the more strict
	interpretation documented above.  Just be sure to use or omit the
	"#" as documented here, and your code will be upwardly-compatible
	with future versions of MOA.  Then, if you never intermix symbol
	properties and string properties in the same property list, you
	probably will not have to worry about this subtlety.

	Note, however, that strings can contain any character(s) in any
	length, while symbols have a more limited range of legal
	characters.  However, symbols are much faster to look up in a
	large property list.

List-manipulation utility functions:
------------------------------------

	These are PRegEx-supplied variants of favorite built-in Perl
	functions.  In Perl, regular expressions and list manipulation are
	tightly coupled, so it's only natural that PRegEx should strive for
	the same.

	You'll notice that many of these functions are generally useful
	for list-manipulation, even if you don't need to do any searching,
	replacing, and extracting.


PRegEx_CopyList(ListOrPList, [Deep, InitList]) ==> CopiedListOrPList

	Returns a copy (shallow by default, deep if Deep is true) of the
	given List or Property List.

	If a memory error occurs, returns an error code instead of a list.

	Warning: Deep copying does not check for recursive list inclusion.
	If you try to Deep copy a recursive data structure, the routine
	will run for a VERY LONG TIME till memory is filled up and then
	fail with a memory error.  

	If InitList is passed, it must be the same type of list as
	ListOrPList.  If present, the items copied from ListOrPList will
	be copied into InitList.  This is a way to use CopyList to deeply
	or shallowly APPEND items from a list onto another list (or in the
	case of PLists, ADD those key/value pairs).

	Note: Assumes that all new PLists should be marked as "sorted" (so
	it does).

	Note: Deep copying only makes deep copies of elements that
	themselves are Lists or PLists.  Otherwise, any other type of
	object is shallowly copied.  (Possible future improvement: if a
	child object has a "clone" method, Deep mode could check for that
	method and try to call it to allow the object to clone itself.)
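
	As a rough illustration of the points above, the following sketch
	(variable names are hypothetical) contrasts shallow and deep
	copies, checks for an error return, and uses InitList to append.
	It assumes only the behavior described in this section:

	set original = [[1, 2], [3, 4]]
	set shallow  = PRegEx_CopyList(original)     -- inner lists are shared
	set deep     = PRegEx_CopyList(original, 1)  -- inner lists are copied too
	-- Per the description above, a non-list result is an error code
	if not listP(deep) then put "CopyList failed:" && deep

	-- Append the items of [3, 4] onto the InitList [1, 2]
	put PRegEx_CopyList([3, 4], 0, [1, 2])
	-- expected, per the description above: [1, 2, 3, 4]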

PRegEx_Grep	  (List, RE, [Opts]) ==> NewList ("Regexp mode")

	Grep produces a new list derived by filtering an existing one.

	Grep has two modes.  This is the first one.  It is triggered by
	supplying a STRING (RE) as the second argument and optional Opts as
	the third.

	Returns a new list whose contents are the elements of List that,
	when matched against RE/Opts, produce at least 1 match.

    Elements of the incoming List must be plain strings, or SrchStrL
    string buffers (i.e. lists containing a string and optional length
    integer).

    Elements that do not meet these requirements will simply be
    skipped.  Errors encountered in matching (e.g. failure of RE to
    compile correctly, memory errors), will cause Grep to finish
    prematurely, returning only the items that have been matched up to
    that point.  Checking LastErrCode() after calling Grep will
    indicate the error code, if any.

	Example: 

	put PRegEx_Grep([1,"abc","","fo","",["w"],"b",#symb], "\w+", "g")	
	-- ["abc", "fo", ["w"], "b"]

	Notice how 3 strings and 1 String Buffer object within the list
	were successfully matched by Grep.  The integer, the empty
	strings, and the symbol did not match and so did not appear in
	the returned list.

PRegEx_Grep	  (List, #Filter, [ArgList]) ==> NewList ("Filter mode")
					
	Grep produces a new list derived by filtering an existing one.

	Grep has two modes.  This is the second one.  It is triggered by
	supplying a symbol (#Filter) as the second argument.

	Filters list according to the boolean results returned by the
    "#Filter" function, which can be your own custom handler or any
    Lingo built-in function whose results can be interpreted as
    Boolean (e.g. #symbolP, #stringP, #integerP, #length).

	Returns a new list whose contents are the elements of List for
    which, when passed to #Filter with optional additional arguments
    from ArgList as described above, #Filter returns true.  

	In this "Filter" mode, Grep is similar to Map or ReplaceExec in
    its recognition of any CallbackAbort/Stop/etc. flags set by the
    #Filter callback function.

	Example:

	put PRegEx_Grep([1,"abc","","fo","",["w"],"b",#symb], #length)	
	-- ["abc", "fo", "b"]

	Notice how only the items for which the Lingo built-in "length"
	function returned a non-zero number were selected, so the empty
	strings and all non-string objects were removed.
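
	A custom filter that takes an extra argument might look like the
	sketch below.  The handler name IsLongerThan is hypothetical (it
	would live in a movie script), and the sketch assumes that the
	values in ArgList are passed to #Filter after the element being
	tested:

	on IsLongerThan elem, minLen
	  -- Reject anything that is not a string
	  if not stringP(elem) then return 0
	  return (length(elem) > minLen)
	end

	put PRegEx_Grep(["a", "abc", 7, "abcd"], #IsLongerThan, [2])
	-- if the assumption holds, only "abc" and "abcd" are kept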


PRegEx_Map	  (List, #MapFunction, [ArgList]) ==> MappedList

	Map takes one list and makes another list where (generally) each
	item in the new list corresponds to an item in the original list.

	It uses a #MapFunction to convert an original item into its
	counterpart in the new list.

	Calls #MapFunction on each element in List.  On each call, first
	argument to #MapFunction is the element being processed.

	Subsequent arguments to #MapFunction are derived from the optional
	ArgList parameter in the manner described earlier.

	#MapFunction should be prepared to convert its first argument into
	the desired output value (of any type), using its additional
	arguments in whatever way needed.

	MapFunction may use PRegEx_CallbackAbort, Stop, etc. to affect the
	behavior of PRegEx_Map.  

	Abort: stop and discard any work done so far; delete
	partially-built result list and return empty list instead.  Set
	LastErrCode to indicate that an Abort was requested.

	Stop: stop and return only elements successfully mapped prior to
	this point; ignore current return value of #MapFunction.

	Last: keeps this current return value but then stops and
	successfully returns the list created up to that point.

	Skip: skips adding a value for the current invocation, but
	continues to process others.  Clever use of "Skip" allows Map to
	do conversion and filtering (similar to Grep's filtering) at the
	same time -- it can "Skip" items that should not make their way
	into the new list, while mapping the items that should.
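
	The combined map-and-filter idea could be sketched as follows.
	The handler name DoubleIfInteger is hypothetical, and the sketch
	assumes that calling PRegEx_CallbackSkip(1) is how the Skip flag
	is set:

	on DoubleIfInteger elem
	  if not integerP(elem) then
	    PRegEx_CallbackSkip(1)  -- ask Map to omit this element
	    return 0                -- not added to the result when skipping
	  end if
	  return elem * 2
	end

	put PRegEx_Map([1, "x", 3], #DoubleIfInteger)
	-- if the assumption holds, "x" is skipped and the integers are doubled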


PRegEx_Sort	  (List, DeepCopy, #SortFunction, [ArgList]) ==> NewList

	Returns a new list consisting of a shallow OR Deep copy of the old
	list, sorted according to the ordering implied by #SortFunction,
	which takes as arguments two values (of any type), here dubbed A
	and B, from the list to be compared, plus optional additional
	arguments if required.  

	For any pair of items, #SortFunction must return -1 if A is less
	than B, 0 if A == B, and 1 if A > B.

	Unlike Lingo's built-in "sort" command, which sorts a list in
	place, Sort does NOT modify the original list in any way.
	Rather, it makes a sorted copy which you may, at your option,
	choose to use in place of the original.
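
	A minimal sketch of a custom sort function follows.  The handler
	name CompareByLength and the variable names are hypothetical; the
	handler simply follows the -1 / 0 / 1 convention described above:

	on CompareByLength a, b
	  set la = length(a)
	  set lb = length(b)
	  if la < lb then return -1
	  if la > lb then return 1
	  return 0
	end

	set fruits   = ["pear", "fig", "banana"]
	set byLength = PRegEx_Sort(fruits, 0, #CompareByLength)
	-- fruits itself is left in its original order; byLength is a
	-- shallow copy ordered from shortest string to longest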


PRegEx_Reverse (List, [DeepCopy, InitList])	==> Reversed copy
				
	Returns a copy (shallow or deep -- default is shallow) of List
    whose elements are in the reverse order of what they were in List.

	If InitList is supplied, then reversed list is appended onto it.
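
	For instance, going only by the description above (this is a
	sketch, not output captured from a test run):

	put PRegEx_Reverse([1, 2, 3])
	-- expected: [3, 2, 1]

	put PRegEx_Reverse([1, 2, 3], 0, [9])
	-- expected: [9, 3, 2, 1]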


PRegEx_Join	  (List, [DelimiterString]) ==> String

	Returns a string which is a concatenation of all strings in List,
    with the optional DelimiterString between each pair (it's the
    opposite of PRegEx_Split -- it rejoins a list of strings into a
    single string).
	
	Delimiter string may be empty, which is the default.

	Example:

	put PRegEx_Join(PRegEx_Split(["a,b,c,d,e"], ",", "g"), ":")
	-- "a:b:c:d:e"


PRegEx_Keys    (PList, [InitList]) ==> KeyList
PRegEx_Values  (PList, [InitList]) ==> ValueList

	Create a list of the keys (properties) or values in PList and
	either returns them in a new list or appends them to the optional
	InitList (a regular list), if provided.  

	These functions do NOT attempt to change the sorting behavior of
	the incoming PList; each returns keys or values in the order that
	MOA yields them, and, if Keys and Values are called without the
	list being altered, then the items yielded by each should
	correspond.  If the PList is modified between calls to Keys and
	Values, then no correspondence is guaranteed, or even likely.

	To get all the keys and values intermixed together pairwise in a
	single list, use PRegEx_PListToList, described below.

	Examples:

	put PRegEx_Keys  ([#a:10,#b:11,#c:12], ["dog", "cow"])
	-- ["dog", "cow", #a, #b, #c]

	put PRegEx_Values([#a:10,#b:11,#c:12], ["dog", "cow"])
	-- ["dog", "cow", 10, 11, 12]


PRegEx_GetSlice(List, Keys, [InitList]) ==> SliceList

	Given a List (regular OR PList) and a list of (item numbers /
	keys), which are said to define a "slice" of the first list,
	creates a new regular list of values corresponding to those
	specified by the "slice", and either appends the resulting list of
	values to optional InitList or returns it as a new List.

	Examples:

	put PRegEx_GetSlice([#a   ,#b   ,#c   ], [3,  2])
	 -- [#c,#b]
	put PRegEx_GetSlice([#a:10,#b:11,#c:12], [#b,#a])
	 -- [11,10]


PRegEx_SetSlice(List, Keys, Values) ==> List

	Given a List (regular or PList) and a list of (item numbers or
	keys), which are said to define a "slice" of the list, plus a
	third list of values corresponding to the keys, sets the
	keys/values accordingly in the incoming List, MODIFYING THE LIST.

	For convenience, also returns the same List/PList that was
	modified, allowing you to start with a list specified directly in
	Lingo, including an empty one, if you need.
	
	If the incoming List was a PList, SetSlice will mark it "Sorted".

	Calling SetSlice with an empty PList [:] is a way to convert a
	list of keys and a corresponding list of values into an SPList.

	Calling SetSlice with an existing PList is a way to add all the
	keys and values from one property list into another.

	Note that any list positions that are modified by SetSlice will
	have their existing values REPLACED (like SetAt and SetAProp would
	do).

	Examples:

	put PRegEx_SetSlice([#a:1], [#d, #c, #b], [2, 3, 4])
	-- [#a:1, #b:4, #c:3, #d:2]

	put PRegEx_SetSlice([#a, #b], [2, 4, 3], ["dog", "cat", "cow"])
	-- [#a, "dog", "cow", "cat"]


PRegEx_PListToList       (PList, [InitList])	 ==> List
PRegEx_PListToListStrings(PList, [InitList])	 ==> List

	"Flattens" PList into a regular list: [key, value, key, value....]

	PRegEx_PListToListStrings does the same, but converts any keys of
    type "symbol" into strings before adding them to the new List.

	Either a new list is created, or items are appended to optional
	InitList, if provided.  

	Examples:

	put PRegEx_PListToList([#a: 2, #b: 4])
	-- [#a, 2, #b, 4]

	put PRegEx_PListToList([#a: 2, #b: 4], ["dog", "cat"])
	-- ["dog", "cat", #a, 2, #b, 4]

	put PRegEx_PListToListStrings([#a: 2, #b: 4, 1: 3])
	-- ["a", 2, "b", 4, 1, 3]


PRegEx_ListToSPList      (List,  [InitPList]) ==> SPList
PRegEx_ListToSPListSym   (List,  [InitPList]) ==> SPList

	"Unflattens" List into a sorted PList, taking elements pairwise
	from List. Any odd key left over at the end gets a void value.

	PRegEx_ListToSPListSym does the same, but converts any string keys
	to symbols before adding to the PList.  Other types of keys are
	left unaltered.

	As with other PRegEx functions that create symbols, the symbol
	created is subject to Lingo's rules governing symbols.  Attempt to
	create invalid symbols at your own risk: MOA's default behavior
	will govern.

	Either a new SP list is created, or items are appended to optional
	InitPList, if provided.  In either case, the resulting list will
	be marked as "sort"ed.

	Examples:

	put PRegEx_ListToSPList([#a, 2, #b, 4])
	-- [#a: 2, #b: 4]

	put PRegEx_ListToSPListSym(["a", "dog", "b", "box", #c, 2])
	-- [#a: "dog", #b: "box", #c: 2]


General utility functions: 
--------------------------

PRegEx_ReadFileToString  (FilePath, TextEncoding)  ==> StringBufferList
PRegEx_WriteStringToFile (FilePath, StringBufferList, TextEncoding) ==> NumBytes/-Err

	ReadFileToString and WriteStringToFile create and accept
	StringBufferList (SrchStrL) objects -- that is, a list containing
	a required string buffer in item 1, and an optional data length
	field in item 2.

	Reading:

	ReadFileToString reads an entire file whose path is specified as a
    MOA-style FilePath and resolved according to Director's documented
    pathname-resolution algorithm (including obeying the canonical
    "@:" syntax), and returns a StringBufferList.

	Conveniently, the StringBufferList may be used as a
	PRegEx-compatible SrchStrL argument, allowing the file buffer to
	be immediately searched and/or manipulated by PRegEx's
	search/replace routines.

	Writing: 

	WriteStringToFile takes a StringBufferList and saves to a file.

	The FilePath may be relative or absolute, and may use any of the
	standard Director path name conventions, but it MUST contain at
	least one directory component.  If it does not, a "directory not
	found" error will occur.  WriteStringToFile does NOT attempt to
	create directories; only files.

	All the characters in the StringBufferList will be written, unless
	the second element of StringBufferList is an integer which is less
	than the character length of the string (according to the Lingo
	"length" operator), in which case that number of characters will
	be written instead.

	On success, returns # of bytes actually written (i.e. the actual
	size of the file), possibly zero.  Note: because of file encoding
	issues, this number is *not* necessarily the same as the number of
	characters written... it could be more or less than that number.

	On failure, tries to delete any created or partially-(over)written
	file, if any, and returns a negative error code.

	So: any negative return value should be interpreted as an error
	code.

	Text Encodings:

	Director 11+ uses Unicode UTF-8 encoding for all strings.
	Therefore, it is necessary to convert any data read from a file
	into UTF-8 before it can be stored in a Director String, and
	optionally to convert it back again when writing to a file.

	Your data files might or might not be stored in UTF-8 format, so
	PRegEx_ReadFileToString and PRegEx_WriteStringToFile each take a
	TextEncoding argument, which is a string giving the name of an
	encoding that should be used to read or write the file.  (When
	reading, PRegEx will convert *from* the format you specify, into
	UTF-8.  When writing, it will convert from UTF-8 *to* the format
	you specify.)
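
	Converting a file from one encoding to another could therefore
	be sketched like this.  The file paths are hypothetical, and the
	sketch assumes that ReadFileToString returns something other than
	a list when it fails, and that PRegEx_DescribeError accepts the
	negative code returned by WriteStringToFile:

	set buf = PRegEx_ReadFileToString("@:data:notes.txt", "ISO-8859-1")
	if listP(buf) then
	  -- buf is now a StringBufferList holding UTF-8 text
	  set result = PRegEx_WriteStringToFile("@:data:notes_utf8.txt", buf, "UTF-8")
	  if result < 0 then put PRegEx_DescribeError(result)
	else
	  put "Read failed:" && buf
	end if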
	
	PRegEx permits all of the encodings defined by the iconv library
	(listed in detail below), plus 2 additional fully bi-directional
	8-bit text encodings that were created for this project:

	MACROMANFULL (also known as:) MACFULL MACINTOSHFULL 
	CP1252FULL   (also known as:) MS-ANSI-FULL  WINDOWS-1252-FULL

	For details of the encodings, see these source files, included
	with the PRegEx distribution:

	 	pregex/project/sources/iconv_custom/MACROMANFULL.TXT
		pregex/project/sources/iconv_custom/CP1252FULL.TXT

	These encodings are called "full" and "bi-directional" because
	they can be used to read *ANY* binary file into memory, and
	although they will have a different format in memory (UTF-8), if
	written back out again with the same encoding, the identical
	binary bytes will be retained.  This should permit you to
	manipulate binary files with PRegEx if you are careful (that is,
	if, having read the binary files into strings, you never alter
	those strings to contain any characters that are not part of the
	encoding you used when reading them in).  Again, see the files
	above for details on each character in the encodings.

	KEEP IN MIND: not all encodings are 8-bit, and not all 8-bit ones
	are bi-directional.  Therefore, if you are reading binary files,
	or Mac Roman, or Windows 1252 (Windows Latin) files, we suggest
	you use one of the 8-bit bi-directional encodings listed above.
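
	A binary round trip using one of the "full" encodings might be
	sketched as follows.  The file paths are hypothetical, and the
	listP test assumes that a failed read returns a non-list value:

	set enc = "CP1252FULL"
	set buf = PRegEx_ReadFileToString("@:media:logo.bin", enc)
	if listP(buf) then
	  -- Do not add characters outside the chosen encoding to buf;
	  -- write it back out with the SAME encoding
	  set result = PRegEx_WriteStringToFile("@:media:logo_copy.bin", buf, enc)
	  if result < 0 then put "Write failed:" && result
	end if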

	Here is the full list of supported text encodings.  For details on
	each encoding, please visit the iconv web page, mentioned earlier:

	ANSI_X3.4-1968 ANSI_X3.4-1986 ASCII CP367 IBM367 ISO-IR-6 ISO646-US ISO_646.IRV:1991 US US-ASCII CSASCII
	UTF-8
	ISO-10646-UCS-2 UCS-2 CSUNICODE
	UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
	UCS-2LE UNICODELITTLE
	ISO-10646-UCS-4 UCS-4 CSUCS4
	UCS-4BE
	UCS-4LE
	UTF-16
	UTF-16BE
	UTF-16LE
	UTF-32
	UTF-32BE
	UTF-32LE
	UNICODE-1-1-UTF-7 UTF-7 CSUNICODE11UTF7
	UCS-2-INTERNAL
	UCS-2-SWAPPED
	UCS-4-INTERNAL
	UCS-4-SWAPPED
	C99
	JAVA
	CP819 IBM819 ISO-8859-1 ISO-IR-100 ISO8859-1 ISO_8859-1 ISO_8859-1:1987 L1 LATIN1 CSISOLATIN1
	ISO-8859-2 ISO-IR-101 ISO8859-2 ISO_8859-2 ISO_8859-2:1987 L2 LATIN2 CSISOLATIN2
	ISO-8859-3 ISO-IR-109 ISO8859-3 ISO_8859-3 ISO_8859-3:1988 L3 LATIN3 CSISOLATIN3
	ISO-8859-4 ISO-IR-110 ISO8859-4 ISO_8859-4 ISO_8859-4:1988 L4 LATIN4 CSISOLATIN4
	CYRILLIC ISO-8859-5 ISO-IR-144 ISO8859-5 ISO_8859-5 ISO_8859-5:1988 CSISOLATINCYRILLIC
	ARABIC ASMO-708 ECMA-114 ISO-8859-6 ISO-IR-127 ISO8859-6 ISO_8859-6 ISO_8859-6:1987 CSISOLATINARABIC
	ECMA-118 ELOT_928 GREEK GREEK8 ISO-8859-7 ISO-IR-126 ISO8859-7 ISO_8859-7 ISO_8859-7:1987 ISO_8859-7:2003 CSISOLATINGREEK
	HEBREW ISO-8859-8 ISO-IR-138 ISO8859-8 ISO_8859-8 ISO_8859-8:1988 CSISOLATINHEBREW
	ISO-8859-9 ISO-IR-148 ISO8859-9 ISO_8859-9 ISO_8859-9:1989 L5 LATIN5 CSISOLATIN5
	ISO-8859-10 ISO-IR-157 ISO8859-10 ISO_8859-10 ISO_8859-10:1992 L6 LATIN6 CSISOLATIN6
	ISO-8859-11 ISO8859-11 ISO_8859-11
	ISO-8859-13 ISO-IR-179 ISO8859-13 ISO_8859-13 L7 LATIN7
	ISO-8859-14 ISO-CELTIC ISO-IR-199 ISO8859-14 ISO_8859-14 ISO_8859-14:1998 L8 LATIN8
	ISO-8859-15 ISO-IR-203 ISO8859-15 ISO_8859-15 ISO_8859-15:1998 LATIN-9
	ISO-8859-16 ISO-IR-226 ISO8859-16 ISO_8859-16 ISO_8859-16:2001 L10 LATIN10
	KOI8-R CSKOI8R
	KOI8-U
	KOI8-RU
	CP1250 MS-EE WINDOWS-1250
	CP1251 MS-CYRL WINDOWS-1251
	CP1252 MS-ANSI WINDOWS-1252
	CP1253 MS-GREEK WINDOWS-1253
	CP1254 MS-TURK WINDOWS-1254
	CP1255 MS-HEBR WINDOWS-1255
	CP1256 MS-ARAB WINDOWS-1256
	CP1257 WINBALTRIM WINDOWS-1257
	CP1258 WINDOWS-1258
	850 CP850 IBM850 CSPC850MULTILINGUAL
	862 CP862 IBM862 CSPC862LATINHEBREW
	866 CP866 IBM866 CSIBM866
	MAC MACINTOSH MACROMAN CSMACINTOSH
	MACCENTRALEUROPE
	MACICELAND
	MACCROATIAN
	MACROMANIA
	MACCYRILLIC
	MACUKRAINE
	MACGREEK
	MACTURKISH
	MACHEBREW
	MACARABIC
	MACTHAI
	HP-ROMAN8 R8 ROMAN8 CSHPROMAN8
	NEXTSTEP
	ARMSCII-8
	GEORGIAN-ACADEMY
	GEORGIAN-PS
	KOI8-T
	CP154 CYRILLIC-ASIAN PT154 PTCP154 CSPTCP154
	KZ-1048 RK1048 STRK1048-2002 CSKZ1048
	MULELAO-1
	CP1133 IBM-CP1133
	ISO-IR-166 TIS-620 TIS620 TIS620-0 TIS620.2529-1 TIS620.2533-0 TIS620.2533-1
	CP874 WINDOWS-874
	VISCII VISCII1.1-1 CSVISCII
	TCVN TCVN-5712 TCVN5712-1 TCVN5712-1:1993
	ISO-IR-14 ISO646-JP JIS_C6220-1969-RO JP CSISO14JISC6220RO
	JISX0201-1976 JIS_X0201 X0201 CSHALFWIDTHKATAKANA
	ISO-IR-87 JIS0208 JIS_C6226-1983 JIS_X0208 JIS_X0208-1983 JIS_X0208-1990 X0208 CSISO87JISX0208
	ISO-IR-159 JIS_X0212 JIS_X0212-1990 JIS_X0212.1990-0 X0212 CSISO159JISX02121990
	CN GB_1988-80 ISO-IR-57 ISO646-CN CSISO57GB1988
	CHINESE GB_2312-80 ISO-IR-58 CSISO58GB231280
	CN-GB-ISOIR165 ISO-IR-165
	ISO-IR-149 KOREAN KSC_5601 KS_C_5601-1987 KS_C_5601-1989 CSKSC56011987
	EUC-JP EUCJP EXTENDED_UNIX_CODE_PACKED_FORMAT_FOR_JAPANESE CSEUCPKDFMTJAPANESE
	MS_KANJI SHIFT-JIS SHIFT_JIS SJIS CSSHIFTJIS
	CP932
	ISO-2022-JP CSISO2022JP
	ISO-2022-JP-1
	ISO-2022-JP-2 CSISO2022JP2
	CN-GB EUC-CN EUCCN GB2312 CSGB2312
	GBK
	CP936 MS936 WINDOWS-936
	GB18030
	ISO-2022-CN CSISO2022CN
	ISO-2022-CN-EXT
	HZ HZ-GB-2312
	EUC-TW EUCTW CSEUCTW
	BIG-5 BIG-FIVE BIG5 BIGFIVE CN-BIG5 CSBIG5
	CP950
	BIG5-HKSCS:1999
	BIG5-HKSCS:2001
	BIG5-HKSCS BIG5-HKSCS:2004 BIG5HKSCS
	EUC-KR EUCKR CSEUCKR
	CP949 UHC
	CP1361 JOHAB
	ISO-2022-KR CSISO2022KR
	CP856
	CP922
	CP943
	CP1046
	CP1124
	CP1129
	CP1161 IBM-1161 IBM1161 CSIBM1161
	CP1162 IBM-1162 IBM1162 CSIBM1162
	CP1163 IBM-1163 IBM1163 CSIBM1163
	DEC-KANJI
	DEC-HANYU
	437 CP437 IBM437 CSPC8CODEPAGE437
	CP737
	CP775 IBM775 CSPC775BALTIC
	852 CP852 IBM852 CSPCP852
	CP853
	855 CP855 IBM855 CSIBM855
	857 CP857 IBM857 CSIBM857
	CP858
	860 CP860 IBM860 CSIBM860
	861 CP-IS CP861 IBM861 CSIBM861
	863 CP863 IBM863 CSIBM863
	CP864 IBM864 CSIBM864
	865 CP865 IBM865 CSIBM865
	869 CP-GR CP869 IBM869 CSIBM869
	CP1125
	EUC-JISX0213
	SHIFT_JISX0213
	ISO-2022-JP-3
	BIG5-2003
	ISO-IR-230 TDS565
	ATARI ATARIST
	RISCOS-LATIN1

	Warnings about text encodings:

	In general, you should be certain that your files are in the
	correct encoding.  Invalid character codes will be ignored/omitted
	when your file is read in, and/or file conversion will stop at the
	first invalid code encountered.  (As an example, please note that
	ISO-8859 has several unmapped code points -- you may wish to use
	CP1252FULL instead -- see above.)

	"UTF-8" encoding:

	Note that if you use the "UTF-8" encoding, the file reading will
	STOP at the first invalid character found, and the string will
	appear to be truncated.

	"raw" encoding:
	
	PRegEx also defines a "raw" encoding that permits a binary file to
	be read directly into a Director string.  The data *really should*
	be in UTF-8 format, but the format will *not* be verified upon
	reading.  So, any attempt to use that string (print it, modify it,
	search or replace within it, view it in the debugger window, put
	it to the message window, etc.), could result in Director crashing
	or other unpredictable behavior, because all of the code paths
	mentioned above (and maybe others) will be assuming that the string
	contains valid UTF-8 characters.  Therefore, this mode should not
	be used, or at the least, should be used only by advanced users
	who are willing to accept the inherent risks.  If you read a
	string with "raw" mode, you should only use "raw" mode when
	writing it back out again, since the same issue will occur there
	-- only "raw" mode will write a string to a file without first
	examining its characters for UTF-8 conformance.


PRegEx_ReadEntireFile  (FilePath)  ==> StringBufferList
PRegEx_WriteEntireFile (FilePath, StringBufferList) ==> NumBytes/-Err

	These functions are *deprecated*.  Please do not use
	them in new code, and please take them out of any existing projects
	(including their aliases, re_read and re_write).  Please see the
	documentation for ReadFileToString and WriteStringToFile, above,
	for an explanation of why they are deprecated.

	For backward compatibility with the prior version of PRegEx, these
	have been redefined to do roughly the same as calling
	ReadFileToString and WriteStringToFile, described immediately
	above, but with "MACROMANFULL" or "CP1252FULL" filled in for
	you as the TextEncoding, depending on which platform you are
	using:

	Mac:

	PRegEx_ReadFileToString (FilePath, "MACROMANFULL")
	PRegEx_WriteStringToFile(FilePath, StringBufferList, "MACROMANFULL")

	Windows:

	PRegEx_ReadFileToString (FilePath, "CP1252FULL")
	PRegEx_WriteStringToFile(FilePath, StringBufferList, "CP1252FULL")

	"MACROMANFULL" and "CP1252FULL" were chosen as the default
	encodings because they are (as described earlier):

	 - full 8-bit (support all 256 possible binary bytes and no more)

     - bi-directional (data read in then written back out using same
       encoding will be unaltered on disk, as long as no characters
       not present in the encoding are added in the meanwhile)

	 - compatible with Director 7-10 behavior, where 8-bit characters
       read into strings were simply interpreted as being MacRoman on
       the Mac, and Windows Latin 1 (aka Windows 1252 aka CP 1252) on
       Windows.

	If the files you are reading and writing will only ever contain
	7-bit ASCII characters, then there is no harm in continuing to use
	these functions.  However, you should still switch to the newer
	versions of these functions so that if your future needs change,
	your Lingo code will reflect the need to expressly choose a text
	encoding when reading and writing files.

	Similarly, if the strings your old projects are reading and
	writing from/to files do not rely on specific encodings of
	non-ASCII characters, there should be no harm in using these
	deprecated functions since the non-ASCII characters should still
	at

Callback-related functions:
---------------------------

	PRegEx's internal callback mechanism is so flexible that we decided
	to expose it in this API so Lingo functions can be created that
	can elegantly make callbacks to other Lingo functions, something
	that is essentially impossible to do using regular Lingo.


PRegEx_CallHandler   (#CallbackFunction, [ArgList1, ArgList2])

	Calls any function by symbol name.  ArgList1 and ArgList2 are both
    optional.  Together they are flattened to produce a single
    argument list for the callback function.  

	In other words, each ArgList is separately treated this way:

	If not a list (i.e. any other kind of value, even "void"), the
    value itself becomes an argument to the #CallbackFunction. If a
    list, it is shallowly flattened and its elements become arguments
    to the #CallbackFunction, in the order they appear in the list.

    Note: if what you really want is to pass the actual list object
    itself and be sure it does not get flattened, just be sure to put
    the list you want to pass inside another temporary list, like
    this:

    PRegEx_CallHandler(#MyFunction, [myList1 ,  myList2]) or this:
    PRegEx_CallHandler(#MyFunction, [myList1], [myList2]) -- equivalent

	... where [] is the Lingo list-construction operator, of course.

    Why have two optional arg lists?  Because you may wish to use this
    function when implementing a callback feature in a Lingo handler
    that you're designing.

    Just as some of the PRegEx callback-oriented functions do, you
    might use ArgList1 for the arguments YOU are supplying to the
    callback function, if any, and pass through ArgList2 for the
    arguments YOUR CALLER is supplying to the callback function, if
    any.

	This is how all the other PRegEx_ functions that take callbacks
    also behave (they all use CallHandler internally, in fact).  You
    don't have to do it this way, but this is a logical and clean way
    to implement any routine that offers to make calls to a callback
    function.

	Note: You may wish to allow the CallbackFunction to call
	PRegEx_CallbackAbort etc. to set those flags while running.  If you
	do allow this, then it is your responsibility to check those flags
	and then to reset them to zero each time after calling
	PRegEx_CallHandler.  Otherwise, those flags may persist and
	incorrectly affect another routine in your call stack.  If there
	is any chance at all that the callback function will set these,
	then be sure to re-set them to zero after it returns.
	
	PRegEx transparently takes care of saving and restoring settings of
	the callback control flags in stack frames below yours, so you
	never have to worry that setting these flags might inadvertently
	interrupt their use in a lower stack frame, if any.
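
	Putting these pieces together, a Lingo handler that offers its
	own callback feature might be sketched as below.  The handler
	name ApplyToEach is hypothetical.  The sketch ASSUMES that
	calling PRegEx_CallbackStop with no argument returns the current
	flag setting and that passing 0 resets it; check the description
	of the flag functions before relying on either:

	on ApplyToEach aList, callbackSym, callerArgs
	  -- A void ArgList would itself be passed as an argument
	  if voidP(callerArgs) then set callerArgs = []
	  set results = []
	  repeat with i = 1 to count(aList)
	    -- ArgList1: our argument (wrapped so a list element is not
	    -- flattened); ArgList2: our caller's arguments
	    set r = PRegEx_CallHandler(callbackSym, [getAt(aList, i)], callerArgs)
	    set wantStop = PRegEx_CallbackStop()  -- ASSUMED: reads the flag
	    PRegEx_CallbackStop(0)                -- ASSUMED: resets the flag
	    if wantStop then exit repeat
	    add results, r
	  end repeat
	  return results
	end

	A fuller version would check and reset the Abort, Last, and Skip
	flags in the same way.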


PRegEx_CallbackAbort([bool]) ==> Stop operation and fail with error
PRegEx_CallbackStop ([bool]) ==> Stop before this iteration, but succeed
PRegEx_CallbackLast ([bool]) ==> Stop after this iteration, but succeed
PRegEx_CallbackSkip ([bool]) ==> Skip this iteration, but continue

	These flags may be set by any callback function that wishes to
	send a signal to its caller.  The caller may either be a built-in
	PRegEx routine OR, a Lingo-authored routine that called the
	function using the PRegEx_CallHandler utility routine.

	These flags should NOT be set by any function that doesn't believe
	it is currently being called as a callback by some PRegEx function.

	As an extended example, consider how these may be called from
	within a ReplFunction to set a flag that tells the ReplaceExec
	function to end its loop after the next time the ReplFunction
	returns.  

	Each one would cause ReplaceExec to terminate slightly
	differently.

	CallbackLast says that the current replacement should be done, but
	then it will be the last one (do not keep searching), terminating
	the replacement successfully (including keeping any replacements
	up to this point).

	CallbackStop says to NOT do the current replacement (ignoring the
	return value of the ReplFunction), and terminate the replacement
	successfully (including keeping any replacements up to this
	point).

	CallbackAbort is the same as CallbackStop, but "aborts", causing
	ReplaceExec to leave the search string untouched, not set any back
	refs, and set FoundCount to zero, much as if the very first search
	had simply not succeeded in the first place.

	Stopping using CallbackLast or CallbackStop could be useful if
	replacement should stop once a certain token is reached in the
	input.

	Aborting could be useful if there is a memory failure or other
	serious failure encountered by the callback function and it needs
	to gracefully abort any further potentially memory-consuming
	activity.
	
	CallbackSkip could be useful if a particular item should be
	ignored/untouched/omitted/left unchanged, but you want your
	calling function to continue with whatever loop it is currently
	processing.
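
	The same flags apply to other callback-driven routines such as
	PRegEx_Map.  As a sketch of stopping at a sentinel token (the
	handler name TagUntilEnd is hypothetical, and PRegEx_CallbackStop(1)
	is assumed to be how the Stop flag is set):

	on TagUntilEnd elem
	  if elem = "END" then
	    PRegEx_CallbackStop(1)  -- stop before this iteration
	    return ""               -- ignored when stopping
	  end if
	  return elem & "!"
	end

	put PRegEx_Map(["a", "b", "END", "c"], #TagUntilEnd)
	-- if the assumption holds, only "a" and "b" are mapped and returned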


Error code constants:
---------------------
	Each of these "constant" functions returns the corresponding
	numeric PRegEx error code.  This can be helpful if you want to
	write code that checks for these specific error cases, either with
	functions that return error codes directly, or for those that
	merely set PRegEx_LastErrCode.

PRegEx_ErrCode_OutOfMemory()
PRegEx_ErrCode_SearchStrLMustBeList()
PRegEx_ErrCode_SearchStrLMustContainString()
PRegEx_ErrCode_SearchStrLLengthArgMustBeInteger()
PRegEx_ErrCode_REMustNotBeEmpty()
PRegEx_ErrCode_REDidNotCompile()
PRegEx_ErrCode_ReplPatMustBeString()
PRegEx_ErrCode_CallbackFuncMustBeSymbol()
PRegEx_ErrCode_CallbackFuncDidNotReturnString()
PRegEx_ErrCode_QuoteMetaNeedsString()
PRegEx_ErrCode_TriedToMatchWithoutSearchStrL()
PRegEx_ErrCode_TriedToMatchWithoutSearchPattern()
PRegEx_ErrCode_TriedToReplaceWithoutMatching()
PRegEx_ErrCode_CallbackRequestedAbort()
PRegEx_ErrCode_UnexpectedMOAError()
PRegEx_ErrCode_UnexpectedInternalError()
PRegEx_ErrCode_CallbackFunctionNotFound()
PRegEx_ErrCode_ExpectedListArgument()
PRegEx_ErrCode_ExpectedPListArgument()
PRegEx_ErrCode_GrepNeedsFunctionNameOrPRegEx()
PRegEx_ErrCode_ExpectedStringArgument()
PRegEx_ErrCode_SortFunctionDidNotReturnInteger()
PRegEx_ErrCode_FileNotFound()
PRegEx_ErrCode_ErrorOpeningFile()
PRegEx_ErrCode_ErrorReadingFile()
PRegEx_ErrCode_ErrorWritingFile()

	Example:

	put PRegEx_DescribeError(PRegEx_ErrCode_SearchStrLMustBeList())
	-- "PRegEx: SearchStrL argument must be a Lingo list."
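
	A check for one specific error after calling a function that
	only sets PRegEx_LastErrCode might be sketched like this.  The
	deliberately malformed pattern is illustrative, and the sketch
	assumes that PRegEx_LastErrCode() returns a value directly
	comparable to the constants above:

	set hits = PRegEx_Grep(["abc", "abd"], "ab(")
	if PRegEx_LastErrCode() = PRegEx_ErrCode_REDidNotCompile() then
	  put PRegEx_DescribeError(PRegEx_LastErrCode())
	end if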


==================================================================
Help! What is a Regular Expression?  What's going on here?
==================================================================

[ASIDE TO NEWBIES: If you don't already know what regular expressions
are and are now burning with desire to use them, then you are facing a
pretty steep, but immensely gratifying, learning curve. Hang in there!
It's worth the effort to learn!]

This is a very brief intro.  Don't expect much.  Try Google.

Regular Expression = Search String or Pattern

That's all there is to it.

Longer explanation: A Regular Expression (or RE or regex or regexp) is
a search specification that can contain special syntax (think:
wildcard characters on steroids) that allows you to perform extremely
complex search, search/replace, or extraction operations on text
buffers of any size.

Examples: 

dog                  -- matches just these letters
(dog)|(cat)          -- matches the letters "dog" or "cat"
organi[sz]e          -- matches US or British spelling of "organize"
^\w{1,8}\.\w{1,3}$   -- matches any DOS 8.3-style file name

In addition to many dozens of special syntax characters like the ones
hinted at above, some special "escape" sequences, triggered by a
backslash, are also recognized within the RE pattern.  

	\n matches a return char (same as Lingo "return" or char(13))
	\t matches a tab char

(There are many others -- see definition of all "escape codes" earlier
in this file.  See also the documentation for the PCRE project.)

Backreferences, written as \N (where N is a number, such as \1, \2
... \99), mean "match (or insert when replacing) the parenthesized
expression number N in this spot".

Backreference example A:

"((Chris)|(Ravi)).*?\1" 

... finds the name "Chris" or "Ravi" in a string, provided it is
followed some distance later by the same name.

Backreference example B:

"(<(\w+)(.*?)>)(.*?)(</\2.*?>)" 

... Matches most pairs of balanced HTML/XML tags, such as: <P>....</P>
or <B>...</B> or <A HREF=foo.html>Home</A>.

In this last example, the backreference substrings would be assigned
(and individually retrievable!) as follows:

Backreference 1: "<A HREF=foo.html>"
Backreference 2: "A"
Backreference 3: " HREF=foo.html"
Backreference 4: "Home"
Backreference 5: "</A>"

Backreferences can be used to extract pieces of data from a string
when searching, and, equally importantly, can be used in a Replacement
pattern when doing a search/replace, so you can insert part or parts
of the matched expression directly into the replaced string.
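
As a sketch of extracting those pieces in Lingo, using the pattern
from example B (the variable name is hypothetical; the expected
values are the ones listed above, not output from a test run):

	set html = ["<A HREF=foo.html>Home</A>"]
	if PRegEx_Search(html, "(<(\w+)(.*?)>)(.*?)(</\2.*?>)", "") > 0 then
	  put PRegEx_GetMatchString(2)  -- the tag name, "A"
	  put PRegEx_GetMatchString(4)  -- the enclosed text, "Home"
	end if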

HOW TO LEARN REGULAR EXPRESSION SYNTAX: 

1) There are whole BOOKS written about regular expression syntax and
   its subtleties.  We are not going to try to teach you anything more
   about them in this document.  Buy one of those books now, if you
   are interested.  http://amazon.com/.

2) Another good way to get started: ask a friend for help and
   pointers.  (Preferably you'll be asking someone other than Chris or
   Ravi :-)).

3) The PCRE documentation, included with this Xtra and on the Web,
   gives a thorough, possibly overly-technical, overview of the
   precise features of the regular expression language supported by
   it, and consequently supported by PRegEx.  (To get the most out of
   it: ignore all the deeper technical stuff; just read about the
   syntax.)  http://pcre.org/man.html	

4) Also, if you have access to perl, be sure to read the "perlre"
   manual page that comes with every perl distribution.  99% of the
   syntax documented there applies here.

5) Practice, practice, practice.  Have a copy of Director open while
   learning.  Try every example in the message window.  Try to make a
   test case for every different feature or behavior you learn about,
   and test it right then and there.  Read and understand the test
   cases in the test movie that accompanies the Xtra.


TWO NOTES FOR PERL USERS ONLY

Note 1: Surrounding the RE with forward slashes is NOT NECESSARY.  In
Perl, the slashes are string delimiters, much like quote marks, and
are not part of the search pattern itself.

Note 2: $-sign and @-sign interpolation are not normally performed by
any of the functions that process the other backslashed escape codes,
because they are features of Perl's built-in string interpolation, not
features of regular expressions per se.  If you need to build up a
replacement pattern string out of pieces, just use the normal Lingo &
and && operators or other means of concatenation, such as PRegEx_Join.
Alternatively, read above about PRegEx_Interpolate, which does all the
usual interpolation functions, and can optionally look up values from
a property list and interpolate them into a string, similar to Perl's
$-sign interpolation feature.

Note that if you plan to search using a RE that has had user-supplied
data interpolated into it, you almost certainly need to call QuoteMeta
either on the user-supplied parts before they are interpolated, or on
the interpolated whole, depending on what you can assume about the
data.
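
The reason for that quoting step can be seen in a small sketch.  It
uses Python's re module as a stand-in, with re.escape playing the part
of the quoting call (an assumption for illustration; it is not
PRegEx's QuoteMeta, and the helper name find_literal is hypothetical).

```python
import re

def find_literal(user_text, haystack, quote=True):
    """Search haystack for "Name: " followed by user_text."""
    # Quoting turns metacharacters such as "." into literal characters.
    piece = re.escape(user_text) if quote else user_text
    m = re.search("Name: " + piece, haystack)
    return m.group(0) if m else None

print(find_literal("J.R", "Name: JOR", quote=False))
print(find_literal("J.R", "Name: JOR"))
```

Without quoting, the "." in the user-supplied text matches any
character, so "J.R" wrongly finds "JOR".  With quoting, only a literal
"J.R" is found.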
	

=========================================
Additional Examples
=========================================

Searching and/or Extracting
---------------------------

==> Search for a string

set FoundCount = max(PRegEx_Search(foo, "(abc+)", ""), 0)

==> Search a string and then extract backrefs by number

if (PRegEx_Search(foo, "(abc+)([,;])", "") > 0) then 
  set ABC =   PRegEx_GetMatchString(1)
  set Punct = PRegEx_GetMatchString(2)
end if
set FoundCount    = PRegEx_FoundCount()

==> Search a string, extracting matching subexpressions into a list or
sorted property list

set NRs = PRegEx_ExtractIntoList   (foo, "Name: (.*?) Rank: (.*?)", "")
set NRs = PRegEx_ExtractIntoSPList (foo, "Name: (.*?) Rank: (.*?)", "")
set FoundCount    = PRegEx_FoundCount()

==> Same, but "globally" -- repeating the search till the end of the
string, extracting _all_ backreferences along the way into a Lingo
list or sorted property list

set NRs = PRegEx_ExtractIntoList   (foo, "Name: (.*?) Rank: (.*?)", "g")
set NRs = PRegEx_ExtractIntoSPList (foo, "Name: (.*?) Rank: (.*?)", "g")
set FoundCount    = PRegEx_FoundCount()

==> Search "globally", but in a "repeat while" loop, so you can
execute code upon each match.

PRegEx_SearchBegin (foo, "Date: (\S+)", "g")
repeat while (PRegEx_SearchContinue () > 0)
  put PRegEx_GetMatchString(1)
end repeat
set FoundCount = PRegEx_FoundCount()


Searching and Replacing
-----------------------

==> Search and replace with a simple string

set FoundCount = max(PRegEx_Replace(foo, "(abc+)", "i", "ABC"), 0)

==> Search and replace with a string with escape codes for back references

set FoundCount = max(PRegEx_Replace(foo, "(abc+)", "i", "### \1 ###"), 0)

==> "Global" flag -- i.e. replace one vs. replace all.

set FoundCount = max(PRegEx_Replace(foo, "(abc+)", "ig", "ABC"), 0)

==> Replace functions also extract backrefs, just as the search
functions do.  So you can retrieve an item at the same time you delete
or modify it:

if (PRegEx_Replace(foo, "(abc+)", "", "") > 0) then 
  set ABC = PRegEx_GetMatchString(1)
end if
set ItemsReplaced  = PRegEx_FoundCount()

==> Search and replace, but a function gets called to perform each
replacement

on NameCnv nameLookup
  -- Look up the captured name (backref 1) in the property list
  return("Name:" && nameLookup[PRegEx_GetMatchString(1)])
end NameCnv

PRegEx_ReplaceExec(foo, "Name: (\S+)", "ig", #NameCnv, [nameLookup])
set ChangeCount = PRegEx_FoundCount()
