I find myself requiring an object to store a text string, with ways to throw markup or presentation attributes around it, but in such a way that they're easy to edit and change separately from the string data. I.e. the usual embedded HTML / ANSI escapes / etc... are not really suitable.
With this in mind, I have come up with String::Tagged.
I'd appreciate any thoughts or comments people might have on the API and
design before I upload it.
-----
String::Tagged(3pm) User Contributed Perl Documentation String::Tagged(3pm)
NAME
"String::Tagged" - string buffers with value tags on ranges
SYNOPSIS
use String::Tagged;
my $st = String::Tagged->new( "An important message" );
$st->apply_tag( 3, 9, bold => 1 );
$st->iter_substr_nooverlap(
   sub {
      my ( $substring, %tags ) = @_;
      print $tags{bold} ? "<b>$substring</b>"
                        : $substring;
   }
);
DESCRIPTION
This module implements an object class, instances of which store a
(mutable) string buffer that supports tags. A tag is a name/value pair
that applies to some non-empty range of the underlying string.
Tags may be arbitrarily overlapped. Any given offset within the string
has, in effect, a set of uniquely named tags. Tags of different names
are independent. For tags of the same name, only the latest,
shortest tag takes effect.
For example, consider a string with three tags represented here:
Here is my string with tags
|-------------------------|   foo => 1
        |-------|             foo => 2
     |---|                    bar => 3
Every character in this string has a tag named "foo". The value of this
tag is 2 for the words "my" and "string" and the space in between, and 1
elsewhere. Additionally, the words "is" and "my" and the space between
them also have the tag "bar" with a value 3.
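The precedence rule can be modelled in a few lines of plain Perl. This is only a sketch and not the module's implementation: the effective_tags helper and the hash-per-tag layout are hypothetical, and it assumes that "latest, shortest" means "the covering tag that starts latest, with ties going to the shorter one".

```perl
use strict;
use warnings;

# Hypothetical model of the example above: each tag is a start offset,
# a length, a name and a value. Not String::Tagged's real storage.
my @tags = (
    { start => 0, len => 27, name => "foo", value => 1 },
    { start => 8, len => 9,  name => "foo", value => 2 },
    { start => 5, len => 5,  name => "bar", value => 3 },
);

# Return a hash of name => value for the tags in effect at $pos.
# Assumption: among same-named tags covering $pos, the one starting
# latest wins; on equal starts, the shorter one wins.
sub effective_tags {
    my ( $pos, @tags ) = @_;
    my %best;
    foreach my $t ( @tags ) {
        next if $pos < $t->{start} or $pos >= $t->{start} + $t->{len};
        my $cur = $best{ $t->{name} };
        $best{ $t->{name} } = $t
            if !$cur
            or $t->{start} > $cur->{start}
            or ( $t->{start} == $cur->{start} and $t->{len} < $cur->{len} );
    }
    return map { $_ => $best{$_}{value} } keys %best;
}

my %at_my   = effective_tags( 8, @tags );   # the "m" of "my"
my %at_here = effective_tags( 0, @tags );   # the "H" of "Here"
```

Under that reading, %at_my holds both "foo" (from the shorter, later tag) and "bar", while %at_here holds only the outer "foo".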
CONSTRUCTOR
$st = String::Tagged->new( $str )
Returns a new instance of a "String::Tagged" object. It will contain no
tags. If the optional $str argument is supplied, the string buffer
will be initialised from this value.
METHODS
$str = $st->str
Returns the plain string contained within the object.
$str = $st->substr( $start, $len )
Returns a substring of the plain string contained within the object.
$st->apply_tag( $start, $len, $name, $value )
Apply the named tag value to the given range.
$st->unapply_tag( $start, $len, $name )
Unapply the named tag value from the given range. If the tag extends
beyond this range, then any partial fragment of the tag will be left in
the string.
$st->delete_tag( $start, $len, $name )
Delete the named tag within the given range. Entire tags are removed,
even if they extend beyond this range.
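The difference between the two methods is easiest to see side by side. The following is a plain-Perl model rather than the module's code: model_unapply and model_delete are hypothetical helpers, a tag is represented as [ start, len, name, value ], and "within the given range" for deletion is assumed to mean "overlapping the range at all".

```perl
use strict;
use warnings;

# Hypothetical model, not the module's code: a tag is [ start, len, name, value ].

# Unapply: cut the range out of each matching tag, keeping the fragments
# that fall outside the range.
sub model_unapply {
    my ( $start, $len, $name, @tags ) = @_;
    my $end = $start + $len;
    my @out;
    foreach my $t ( @tags ) {
        my ( $ts, $tl, $tn, $tv ) = @$t;
        my $te = $ts + $tl;
        if ( $tn ne $name or $te <= $start or $ts >= $end ) {
            push @out, $t;      # other name, or no overlap: untouched
            next;
        }
        push @out, [ $ts,  $start - $ts, $tn, $tv ] if $ts < $start;  # left fragment
        push @out, [ $end, $te - $end,   $tn, $tv ] if $te > $end;    # right fragment
    }
    return @out;
}

# Delete: drop every matching tag that overlaps the range, in its entirety.
sub model_delete {
    my ( $start, $len, $name, @tags ) = @_;
    my $end = $start + $len;
    return grep {
        my ( $ts, $tl, $tn ) = @$_;
        $tn ne $name or $ts + $tl <= $start or $ts >= $end;
    } @tags;
}

# "An important message" with bold on "important" (offset 3, length 9).
my @tags = ( [ 3, 9, "bold", 1 ] );

my @after_unapply = model_unapply( 3, 2, "bold", @tags );  # bold remains on "portant"
my @after_delete  = model_delete(  3, 2, "bold", @tags );  # bold is gone completely
```

In this model, unapplying over "im" leaves a single fragment at offset 5 with length 7, whereas deleting over the same two characters removes the whole tag.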
$st->iter_tags( $callback )
Iterate the tags stored in the string. For each tag, the CODE reference
in $callback is invoked once.
$callback->( $start, $length, $tagname, $tagvalue )
$st->iter_tags_nooverlap( $callback )
Iterate non-overlapping ranges of tags stored in the string. The CODE
reference in $callback is invoked for each range in the string where no
tags change. The entire set of tags active in that range is given to
the callback.
$callback->( $start, $length, %tags )
The callback will be invoked over the entire length of the string,
including any ranges with no tags applied.
$st->iter_substr_nooverlap( $callback )
Iterate ranges of the substring in the same way as
"iter_tags_nooverlap()", but passing the substring of data instead of
the start position and length.
$callback->( $substr, %tags )
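One way to picture the non-overlapping iteration is as a split of the string at every offset where a tag starts or ends. The sketch below is a plain-Perl illustration under that assumption, not the module's algorithm; the [ start, len, name, value ] layout and the variable names are hypothetical, and same-named overlaps are resolved naively.

```perl
use strict;
use warnings;

my $str  = "An important message";
my @tags = ( [ 3, 9, "bold", 1 ] );    # hypothetical layout: [ start, len, name, value ]

# Collect every offset at which some tag starts or ends, plus both string ends.
my %edge = ( 0 => 1 );
$edge{ length $str } = 1;
foreach my $t ( @tags ) {
    $edge{ $t->[0] } = 1;
    $edge{ $t->[0] + $t->[1] } = 1;
}
my @edges = sort { $a <=> $b } keys %edge;

# Between two neighbouring edges no tag can change, so each pair is one range.
my @ranges;
foreach my $i ( 0 .. $#edges - 1 ) {
    my ( $from, $to ) = @edges[ $i, $i + 1 ];
    my %active;
    foreach my $t ( @tags ) {
        my ( $ts, $tl, $name, $value ) = @$t;
        # Simplification: with same-named overlaps, the later list entry wins.
        $active{$name} = $value if $ts <= $from and $ts + $tl >= $to;
    }
    push @ranges, [ substr( $str, $from, $to - $from ), %active ];
}

# Render each range much as the SYNOPSIS callback would.
my $html = join "", map {
    my ( $substring, %t ) = @$_;
    $t{bold} ? "<b>$substring</b>" : $substring;
} @ranges;
```

Each element of @ranges has the same shape as the argument list described for the callback: the substring first, then the name/value pairs active over it, including ranges where no tag applies.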
@names = $st->tagnames
Returns the set of tag names used in the string, in no particular
order.
$tags = $st->get_tags_at( $pos )
Returns a HASH reference of all the tag values active at the given
position.
$value = $st->get_tag_at( $pos, $name )
Returns the value of the named tag at the given position, or "undef" if
the tag is not applied there.
$st->set_substr( $start, $len, $newstr )
Modifies a range of the underlying plain string to that given.
NOTE: - Because of the way that this method modifies the underlying
string, its use would disturb the tags applied in its range, or after
if the length changes. There is no clear behaviour for what should be
done to tags applied in the affected range, or after it.
Therefore, this method will throw an exception if any tags apply after
the $start index.
This is unlikely to be useful in a general application; I am well aware
of this fact. I welcome suggestions on what the behaviour(s) ought to
be.
$st->insert( $start, $newstr )
Insert the given string at the given position. A shortcut around
"set_substr()".
$st->append( $newstr )
Append to the underlying plain string. Because no tags will yet apply
here, this method is not subject to the note given in "set_substr()".
$st->append_tagged( $newstr, %tags )
Append to the underlying plain string, and apply the given tags.
TODO
* Implement (possibly multiple) ways to modify substrings, while
keeping some level of sanity on the tags. Sensible behaviour
would probably be:
Tags entirely before the replaced region would remain unchanged.
Tags entirely within the replaced region would be deleted.
Tags entirely after the replaced region would get shifted up/down
the appropriate amount to ensure they still apply to the right
characters.
Tags that start before and end after the range would remain, and
have their lengths suitably adjusted.
Tags that span just the start or end of the range, but not both,
would be truncated shorter, so as to remove the part of the tag
applied on the modified section but preserving that applied
outside.
There are likely variations on these rules that could equally apply
to some uses of tagged strings. Consider whether the behaviour of
modification is chosen per-method, per-tag, or per-string.
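To make the proposed rules concrete, here is a plain-Perl sketch of them. Nothing like this exists in the module as described above: adjust_tags is a hypothetical helper, a tag is modelled as [ start, len, name, value ], and the handling of a tag that spans only the end of the range (moved to start just after the new text) is one possible reading of "preserving that applied outside".

```perl
use strict;
use warnings;

# Hypothetical helper: given that $len characters at $start are replaced by
# $newlen characters, return the adjusted tags. A tag is [ start, len, name, value ].
sub adjust_tags {
    my ( $start, $len, $newlen, @tags ) = @_;
    my $end   = $start + $len;
    my $delta = $newlen - $len;
    my @out;
    foreach my $t ( @tags ) {
        my ( $ts, $tl, @rest ) = @$t;
        my $te = $ts + $tl;
        if    ( $te <= $start ) { push @out, [ $ts, $tl, @rest ] }            # before: unchanged
        elsif ( $ts >= $end )   { push @out, [ $ts + $delta, $tl, @rest ] }   # after: shifted
        elsif ( $ts >= $start and $te <= $end ) { }                           # within: deleted
        elsif ( $ts <  $start and $te >  $end ) {
            push @out, [ $ts, $tl + $delta, @rest ];                          # encloses: resized
        }
        elsif ( $ts < $start ) { push @out, [ $ts, $start - $ts, @rest ] }    # spans start only
        else  { push @out, [ $start + $newlen, $te - $end, @rest ] }          # spans end only
    }
    return @out;
}

# Replace "important" (offset 3, length 9) in "An important message" with "urgent".
my @tags = (
    [ 0,  2,  "before", 1 ],   # "An"
    [ 5,  3,  "within", 1 ],   # "por"
    [ 13, 7,  "after",  1 ],   # "message"
    [ 0,  20, "whole",  1 ],   # entire string
    [ 0,  6,  "head",   1 ],   # "An imp"
    [ 9,  7,  "tail",   1 ],   # "ant mes"
);
my @adjusted = adjust_tags( 3, 9, 6, @tags );
```

With these inputs the "within" tag disappears, "after" moves from offset 13 to 10, "whole" shrinks from 20 to 17 characters, "head" is cut back to "An ", and "tail" keeps only " mes" at its new position. Whether a single rule set like this should be fixed or selectable per method, per tag, or per string is exactly the open question.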
AUTHOR
Paul Evans <[email protected]>
perl v5.10.0 2009-01-30 String::Tagged(3pm)
-----
--
Paul "LeoNerd" Evans
[email protected]
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
