Re: Scripting Mail.app on 10.3.2?

2004-01-22 Thread John Delacour
At 5:14 pm -0600 21/1/04, Ken Williams wrote:

> I think IPC::Run can do this nicely, but I haven't exactly tested it:
>
> use IPC::Run qw(run);
> run('osascript', \$ass, \$asresult);

Can you provide a real working example?  I find the Synopsis of 
IPC::Run a little obscure.

Thanks.

JD




Re: Need help with a string parsing problem

2004-01-22 Thread Andy Turner
On Wed, Jan 21, 2004 at 10:39:09PM -0600, Ken Williams wrote:
> This is probably because 5.6 expands the whole for(...) list in 
> advance, but 5.8 evaluates it lazily.

Ah, but I'm using stock Perl on Panther, which is 5.8.1-RC3.  It also
happens on my Debian Unstable box which is running 5.8.2.

> In any case, it's always a little risky to use $1 and friends more than 
> 1 statement after the regex they come from.  Too many things clobber 
> 'em at a distance.

Yeah, that's my feeling.  And if the code ever becomes more complex then
their meaning can become pretty obscure.

-- 
Andy <[EMAIL PROTECTED]> - http://anime.mikomi.org/ - Community Anime Reviews 
  Good men, if such there be, would either remain true to their political
  faith and lose their economic support, or they would cling to their
  economic master and be utterly unable to do the slightest good. The
  political arena leaves one no alternative, one must either be a dunce or
  a rogue.-- Emma Goldman


Re: Scripting Mail.app on 10.3.2?

2004-01-22 Thread Paul McCann
Hi John,
you asked...

> Can you provide a real working example.  I find the Synopsis of 
> IPC::Run a little obscure.

It's expecting an arrayref in the first slot: here's a trivial,
but working, example.

---
#!/usr/bin/perl
use strict;
use IPC::Run qw(run);
my (@osa,$ass,$asresult);
@osa=qw(osascript);
$ass=<

tricky parsing question

2004-01-22 Thread wren argetlahm
I'm working on a linguistic module and I'm trying to
find a good way to split a string up into "segments".
I can't assume single character strings and want to
assume maximal segments. As an example, the word
"church" would be rendered as the list ('ch', 'u',
'r', 'ch') and wouldn't break the "ch" up smaller even
though both "c" and "h" are valid segments in English.
I have all the valid segments for a given language
stored as keys in a hash, now I just need an algorithm
to chop up a string into a list. Any ideas?

~wren

__
Do you Yahoo!?
Yahoo! SiteBuilder - Free web site building tool. Try it!
http://webhosting.yahoo.com/ps/sb/


Re: tricky parsing question

2004-01-22 Thread Rick Measham
On 23 Jan 2004, at 01:21 pm, wren argetlahm wrote:

> I'm working on a linguistic module and I'm trying to
> find a good way to split a string up into "segments".
> I can't assume single character strings and want to
> assume maximal segments. As an example, the word
> "church" would be rendered as the list ('ch', 'u',
> 'r', 'ch') and wouldn't break the "ch" up smaller even
> though both "c" and "h" are valid segments in English.
> I have all the valid segments for a given language
> stored as keys in a hash, now I just need an algorithm
> to chop up a string into a list. Any ideas?

Wren, when you say 'segments' it appears you mean phonemes or phonetics.

CPAN has several modules that may help you:

Lingua::Phoneme uses the Moby Pronunciation Dictionary to find the 
phonemes.

Text::Metaphone also deals with phonemes and will return 'Church' as 
'XRX' meaning 'ch', 'r', 'ch'. Unfortunately it returns the 'ch' in 
'Character' as an 'X' also.

And that, of course, is the most difficult part. English is such a 
hodge-podge of hacks from other languages that understanding it via 
algorithms is very very hard.

Cheers!
Rick


Rick Measham
Senior Designer and Developer
Printaform Pty Ltd
Tel: (03) 9850 3255
Fax: (03) 9850 3277
http://www.printaform.com.au
http://www.printsupply.com.au
vcard: http://www.printaform.com.au/staff/rickm.vcf


Re: tricky parsing question

2004-01-22 Thread Chris Devers
On Thu, 22 Jan 2004, wren argetlahm wrote:

> I'm working on a linguistic module and I'm trying to
> find a good way to split a string up into "segments".

Your definition of "segment" here is vague; is it safe to ignore that and
just accept that a canonical list of each language's 'segments' is a
static thing that is already stored as hash keys? 

And for that matter, why a hash? Are you associating values of some kind
with each segment key? If you're not, this could be easier to solve with
plain arrays, since the list of elements can be manually determined:

  @english_segs = qw[ ch sh th ... x y z ];

Or whatever. This way, the common ones can be frontloaded, which may
speed things up a bit.


I bet there's a clever solution to a problem like this in the Perl
Algorithms ("Wolf") book, but I'd have to poke around to find it -- it's
probably presented as the solution to a different problem. 

At a guess, I think you want a loop based on the length of the longest
pre-determined element. Hence, if the longest element is three or four
letters (maybe you count the 'ion' in words like 'traction', or the 'ious'
in words like 'serious' [1]), then you can look at the string in chunks of
that many letters, looking for the longest possible match in your elements
list, then push back whatever is left over after you make a match and
start over again with the next chunk of three or four letters. 
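
A minimal sketch of the loop described above, assuming the inventory is
already a hash keyed by segment. The sub name segment() and the sample
inventory are made up for illustration; this is one possible reading of
the idea, not a tested module.

```perl
use strict;
use warnings;

# Hypothetical inventory; the values are placeholders.
my %segments = map { $_ => 1 } qw(ch sh th c h u r s);

# Greedy longest-match segmentation: at each position try the longest
# candidate first and shrink until a known segment is found.
sub segment {
    my ($string, $inventory) = @_;

    # The longest known segment bounds the lookahead window.
    my $max = 0;
    for my $seg (keys %$inventory) {
        $max = length($seg) if length($seg) > $max;
    }

    my @out;
    my $pos = 0;
    while ($pos < length($string)) {
        my $found;
        for (my $len = $max; $len >= 1; $len--) {
            my $chunk = substr($string, $pos, $len);
            next if length($chunk) < $len;    # window ran off the end
            if (exists $inventory->{$chunk}) {
                $found = $chunk;
                last;
            }
        }
        die "no segment matches at position $pos of '$string'\n"
            unless defined $found;
        push @out, $found;
        $pos += length($found);
    }
    return @out;
}

print join(' ', segment('church', \%segments)), "\n";    # ch u r ch
```

Since the lookup is a hash probe, the order of the inventory doesn't
matter here; the window length does the "longest first" work.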


I think I'm starting to describe how to implement the regex engine here :/

Maybe Parse::RecDescent? Maybe I'm over-thinking this...



[1] My copy of /usr/share/dict/words has 75 words with 'ious', but only
one word with 'iou[^s]', so I'm guessing that 'ious' might be taken
as a single entity for your purposes.



-- 
Chris Devers



Re: tricky parsing question

2004-01-22 Thread wren argetlahm
--- Bill Stephenson <[EMAIL PROTECTED]> wrote:
> You need to get a book on regex's.

I know the solution lies in regex's, the problem is
that I can't quite figure out a generic enough way of
doing it. The problem is for a module and so the list
of valid segments is user defined. I guess I could do
something like:

$segs = '(' . join('|', map { quotemeta } @segs) . ')';
if ($string =~ s/^$segs//) {
    $first_seg = $1;    # only trust $1 right after a successful match
}

But I'd have to sort @segs somehow so that the longest
segments come first, and since alphabets can have many
many different segments, I worry about memory issues.
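
For what it's worth, here is a rough sketch of the sorted-alternation
idea, assuming the inventory arrives as a plain list. The names @segs,
$seg_re and split_segments() are illustrative only. Perl tries
alternatives left to right, so sorting longest first is what makes 'ch'
win over 'c'. An alternation of a few dozen short strings is a small
pattern, so memory should not be a problem at that size.

```perl
use strict;
use warnings;

my @segs = qw(c h ch u r);    # hypothetical user-supplied inventory

# Longest first so 'ch' is tried before 'c'; quotemeta guards against
# segments that contain regex metacharacters.
my $alternation = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) || $a cmp $b } @segs;
my $seg_re = qr/($alternation)/;

sub split_segments {
    my ($string) = @_;
    my @found;
    # \G anchors each match where the previous one ended;
    # /c keeps pos() in place when a match finally fails.
    while ($string =~ /\G$seg_re/gc) {
        push @found, $1;
    }
    my $stopped = pos($string) // 0;
    die "cannot segment '$string' past offset $stopped\n"
        if $stopped < length($string);
    return @found;
}

print join(' ', split_segments('church')), "\n";    # ch u r ch
```

Compiling the pattern once with qr// means the sort and join happen a
single time per alphabet rather than once per string.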

--- Bill Stephenson <[EMAIL PROTECTED]> wrote:
> Perl.com has the best available, "Mastering 
> Regular Expressions" is what you want.
> 
> Sounds like a formidable task though. For some
> additional help with your regex you can play 
> with a tool posted on the "perlhelp.com" web 
> site. Go to "Resources" and look for the 
> "Regular Expression Explanation Generator".

Thanks, I'll have to check those out sometime.

--- Rick Measham <[EMAIL PROTECTED]> wrote:
> Wren, when you say 'segments' it appears you 
> mean phonemes or phonetics.

Yeah, I do mean phonemes (or something like it). The
module is language independent, but I'll check those
modules out.

--- Chris Devers <[EMAIL PROTECTED]> wrote:
> Your definition of "segment" here is vague; is 
> it safe to ignore that and just accept that a 
> canonical list of each language's 'segments' is 
> a static thing that is already stored as hash 
> keys?

By "segment" I mean the smallest character or sequence
of characters that has a regular pronunciation. But
yes, it's safe to ignore that and assume there's a
canonical list of "segments" already in memory.

I am indeed associating the segments with values,
hence storing them as keys in a hash. Also, by storing
them that way, if I'm trying to find the values
associated with a given segment, I can quickly find it
by $all_segments{$segment_in_question} rather than
needing to do a for or foreach loop over an array of
an estimated 15..50 items.

The loop based off the longest element thing sounds
like a good idea, I'll see if I can get it to work.

For those who wonder what on earth I'm up to... it's
an OO module for autosegmental phonology. In short you
feed the object a string and an "alphabet" which maps
segments to values ("d" has +voicing, +dental,
-vocalic, etc) and it creates an array of hashes (or
hash of arrays) where the index is the sequence number
of the segment in the string, and where the key is the
name of the "tier" (voicing, dental, vocalic, etc).
Then there'll be ways to muck around with the object
ala phonetic rules. Then there'll be a method to tie
all of the tiers back together into a single string
(per the alphabet) and spit it back out.
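
A rough sketch of the array-of-hashes layout, just to make the
description concrete. Everything here is an assumption: the sub names
to_tiers() and to_string(), the '+'/'-' encoding, and the feature values
for "a" (only the three features for "d" come from the description
above).

```perl
use strict;
use warnings;

# Hypothetical alphabet: segment => { tier => value }.
my %alphabet = (
    d => { voicing => '+', dental => '+', vocalic => '-' },
    a => { voicing => '+', dental => '-', vocalic => '+' },   # invented
);

# Array of hashes: index = position in the string, key = tier name.
sub to_tiers {
    my ($alphabet, @segments) = @_;
    # Copy each feature set so rules can edit a slot without
    # changing the alphabet itself.
    return map { +{ %{ $alphabet->{$_} } } } @segments;
}

# Tie the tiers back together: find the segment whose features match
# each slot exactly, or emit '?' when nothing in the alphabet fits.
sub to_string {
    my ($alphabet, @slots) = @_;
    my $out = '';
    SLOT: for my $slot (@slots) {
        for my $seg (sort keys %$alphabet) {
            my $feat = $alphabet->{$seg};
            next if keys(%$feat) != keys(%$slot);
            next if grep { !exists $slot->{$_} || $slot->{$_} ne $feat->{$_} }
                    keys %$feat;
            $out .= $seg;
            next SLOT;
        }
        $out .= '?';
    }
    return $out;
}

my @tiers = to_tiers(\%alphabet, qw(d a d));
print $tiers[0]{voicing}, "\n";                  # + (the voicing tier of slot 0)
print to_string(\%alphabet, @tiers), "\n";       # dad
```

The hash-of-arrays alternative would index as $tiers{voicing}[0]
instead; which one reads better probably depends on whether the rules
walk along a tier or across a segment.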
