Re: dot POS files and Corpus Linguistics

2010-04-28 Thread David Glasgow

On 28 Apr 2010, at 3:17 am, use-revolution-requ...@lists.runrev.com wrote:

 I then found out that in the case of corpus files POS means 'parts of speech'.
 This is typical academia delighting in obscurantism.
 
 Now for more 'fun':
 
 Also bundled in the corpus are .psd files which, wait for it, are NOT
 Adobe Photoshop files.
 
 PSD: Probably Something Different ???

Richmond,

I have no knowledge or advice that might help you.  Further, wrangling your 
strange corpus is of no possible use or real interest to me. 

...but somehow I'm hooked.

There is something of the Pratchett about your posts. Please continue 
instalments on your progress. 

Good Luck,

David Glasgow



___
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution


Re: dot POS files and Corpus Linguistics

2010-04-28 Thread Richmond Mathewson

 On 28/04/2010 11:33, David Glasgow wrote:

On 28 Apr 2010, at 3:17 am, use-revolution-requ...@lists.runrev.com wrote:


I then found out that in the case of corpus files POS means 'parts of speech'.
This is typical academia delighting in obscurantism.

Now for more 'fun':

Also bundled in the corpus are .psd files which, wait for it, are NOT
Adobe Photoshop files.

PSD: Probably Something Different ???

Richmond,

I have no knowledge or advice that might help you.  Further, wrangling your 
strange corpus is of no possible use or real interest to me.

...but somehow I'm hooked.

There is something of the Pratchett about your posts. Please continue 
instalments on your progress.

Good Luck,



Oh Dear! You are in trouble . . .  :)

Paying Council Tax today; standing in a queue and listening to moronic
proles wibbling on about the price of fish.

I shall ruminate (cowlike) on POS and PSD files; whether that elevates
me above the level of the proles or makes me even more moronic
than they are has yet to be seen.


dot POS files and Corpus Linguistics

2010-04-27 Thread Richmond Mathewson

 Well, Yippee-doo; the good folks at the University of
Oxford have sent me the files of the
York-Toronto-Helsinki Parsed Corpus of Old English Prose
(try saying that with your mouth full of cornflakes).

Jolly generous considering it is normally restricted to British
Higher Education Institutions (somehow the University of
Plovdiv, Paisii Hilendarski doesn't fit in that category).

HOWEVER; the corpus comes in .pos files which cheeses me
off immensely; on opening them with the redoubtable
TextWrangler they are heavily formatted in some odd fashion
suggesting some sort of meta-tagging.

The Java-based CS_2.002.74.jar, a.k.a. 'CorpusSearch', doesn't run
for some funny reason on ye olde G4 (have yet to try it on the
Ubu-Box); but that doesn't really fuss me as ye olde academics
have decided the parameters of their stuff in advance and my feet
are too big for their shoes (hey; it's mixed metaphors time again).

So; I am looking to build a Runrev data-miner / chewer / masticator
/ whatever; but, until I can work out what a .pos file can be opened with
(so I can hae a keek at its formatin) the whole thing is on standby.
Once I can see what a .pos file should look like in some sort of POS-file
reader I can cobble together a suitably algorithmic sieve to make the
file look like it should inside a text field prior to 'chewin the fat'.
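Since POS turns out, as noted elsewhere in this thread, to mean 'parts of speech', a sieve of the sort described above might look roughly like the sketch below. It is written in Python rather than Revolution, and it rests on a loud assumption: that each token in a .pos file has the shape word_TAG, separated by whitespace. The thread does not confirm that layout, and the sample line and its tags are invented for illustration, not taken from the corpus. The names split_token, sieve and tag_counts are hypothetical too.

```python
import re

# ASSUMPTION: every tagged token looks like "word_TAG".
# The real corpus files may use a different layout; adjust the pattern to suit.
TOKEN = re.compile(r"^(?P<word>.+)_(?P<tag>[^_\s]+)$")


def split_token(token):
    """Return (word, tag); tag is None when the token carries no tag."""
    match = TOKEN.match(token)
    if match is None:
        return token, None
    return match.group("word"), match.group("tag")


def sieve(text):
    """Strip the tags and return the bare words, one output line per input line."""
    lines = []
    for line in text.splitlines():
        words = [split_token(tok)[0] for tok in line.split()]
        lines.append(" ".join(words))
    return "\n".join(lines)


def tag_counts(text):
    """Count how often each tag occurs, ignoring untagged tokens."""
    counts = {}
    for tok in text.split():
        _, tag = split_token(tok)
        if tag is not None:
            counts[tag] = counts.get(tag, 0) + 1
    return counts


# Invented sample line, NOT taken from the corpus.
sample = "se_D cyning_N rad_VBD ._."
print(sieve(sample))
print(tag_counts(sample))
```

The same split-on-the-last-underscore idea should carry over to a Revolution handler once the real token layout has been checked against an actual file.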

Google comes up with unintentionally witty results about 'point of sale'
and so forth, as well as something about Arabic linguistic corpora,
Chinese linguistic corpora and so forth (well, at least they are going
in the right direction).

Having written one of those slimy messages back, where one thanks people
fulsomely and then shoves in the 'However'; I got a 'we cannot comment on
other methods of accessing the corpus' message. Well; at least I signed
my name with my second name (Richmond) otherwise I would have had what
the Americans call a 'Dear John' message . . .  :)

Any help re POS-file readers would be most welcome.

sincerely, Richmond Mathewson.


Re: dot POS files and Corpus Linguistics

2010-04-27 Thread stephen barncard

Richmond, it appears that .pos files are LOTUS NOTES,  among many others

http://file-extension.net/seeker/file_extension_pos

http://filext.com/file-extension/POS

http://en.wikipedia.org/wiki/Lotus_Notes

http://www.computerfileextensions.com/file-extensions.php/POS

FILE FORMAT:
http://www.x-ways.net/winhex/POS_Format_2_0.html



-- 
-
Stephen Barncard
Back home in SF


Re: dot POS files and Corpus Linguistics

2010-04-27 Thread Richmond Mathewson

 On 27/04/2010 22:03, stephen barncard wrote:

Richmond, it appears that .pos files are LOTUS NOTES,  among many others

http://file-extension.net/seeker/file_extension_pos

http://filext.com/file-extension/POS

http://en.wikipedia.org/wiki/Lotus_Notes

http://www.computerfileextensions.com/file-extensions.php/POS

FILE FORMAT:
http://www.x-ways.net/winhex/POS_Format_2_0.html



Thank you very much for your suggestion.

However, I got led up that garden path and spent some time mucking
around with Lotus notes.

I then found out that in the case of corpus files POS means 'parts of 
speech'.

This is typical academia delighting in obscurantism.

Now for more 'fun':

Also bundled in the corpus are .psd files which, wait for it, are NOT
Adobe Photoshop files.

PSD: Probably Something Different ???


Re: dot POS files and Corpus Linguistics

2010-04-27 Thread wayne durden

I think we can all agree RunRev is the best dev environment going, but
suggesting dot.net is a POS may be going a little too far.

On Tue, Apr 27, 2010 at 5:20 PM, Richmond Mathewson 
richmondmathew...@gmail.com wrote:

  On 27/04/2010 22:03, stephen barncard wrote:

 Richmond, it appears that .pos files are LOTUS NOTES,  among many others

 http://file-extension.net/seeker/file_extension_pos

 http://filext.com/file-extension/POS

 http://en.wikipedia.org/wiki/Lotus_Notes

 http://www.computerfileextensions.com/file-extensions.php/POS

 FILE FORMAT:
 http://www.x-ways.net/winhex/POS_Format_2_0.html


  Thank you very much for your suggestion.

 However, I got led up that garden path and spent some time mucking
 around with Lotus notes.

 I then found out that in the case of corpus files POS means 'parts of
 speech'.
 This is typical academia delighting in obscurantism.

 Now for more 'fun':

 Also bundled in the corpus are .psd files which, wait for it, are NOT
 Adobe Photoshop files.

 PSD: Probably Something Different ???
