Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread John Benediktsson
If you get lost in path land you can always take a break and use the 
/full/path/to/docsplit.

 On Feb 9, 2014, at 2:03 AM, CW Alston cwalsto...@gmail.com wrote:
 
 Ah! Thanks, Joe- 
 Great tip; should clear up the issue with which. I am indeed starting 
 Factor in the Finder. I'll try adjusting the plist.
 Maybe that even has something to do with my docsplit puzzle. Since I can 
 address commands like couchdb 
 via a process, I should be able to invoke docsplit that way as well, even 
 though htop shows me that docsplit
 itself spawns sub-processes, like poppler & tesseract, to do its extraction 
 work. Interesting.
 
 I'll go study the Mac dev doc you point to, & see what I can glean from there.
 
 Back to the books,
 ~cw
 
 
 
 
 
 On Sat, Feb 8, 2014 at 10:27 PM, Joe Groff arc...@gmail.com wrote:
 On Sat, Feb 8, 2014 at 7:30 PM, CW Alston cwalsto...@gmail.com wrote:
 Hi -
 Ok, I've upgraded using factor-macosx-x86-32-2013-07-25-14-21.dmg,
 still Version 0.97. Same issue with Factor's which:
 
 IN: scratchpad USE: tools.which
 IN: scratchpad "couchdb" which .
 f
 
 IN: scratchpad "python" which .
 "/usr/bin/python"
 
 - The trouble appears to be with reporting my PATH properly, via getenv:
 
 IN: scratchpad USE: environment
 IN: scratchpad "PATH" os-env .
 "/usr/bin:/bin:/usr/sbin:/sbin"
 
 IN: scratchpad USE: unix.ffi
 IN: scratchpad "PATH" getenv .
 "/usr/bin:/bin:/usr/sbin:/sbin"
 
 IN: scratchpad \ getenv see
 USING: alien.c-types alien.syntax ;
 IN: unix.ffi
 LIBRARY: libc FUNCTION: c-string getenv ( c-string name ) ;
 inline
 
 - Here's my actual PATH, as seen in the terminal:
 
 ➜  ~ git:(master) ✗ echo $PATH
 /usr/local/bin:/usr/local/opt/ruby/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/cwalston/factor:/Users/cwalston/bin:/usr/local/go/bin:/usr/local/lib/node_modules:/usr/local/narwhal/bin:/usr/texbin:/usr/X11/bin:/usr/local/sbin:/Users/cwalston/.gem/ruby/1.8/bin:/Applications/Mozart.app/Contents/Resources/bin
 
 - whereby which correctly finds couchdb:
 
 ➜  ~ git:(master) ✗ which couchdb
 /usr/local/bin/couchdb
 
 So, Factor's which (et al.) doesn't search beyond 
 /usr/bin:/bin:/usr/sbin:/sbin.
 
 Reading through man getenv (GETENV(3), on OSX 10.6.8 ), doesn't give me
 a clue as to how to rectify this short-sightedness via the libc getenv.
 
 This is probably a side issue to my docsplit quandary (but maybe not).
 Anyone see a way to report my actual PATH to which in Factor? My PATH is
 augmented in my .zshrc. I don't understand why the libc function doesn't 
 read it. Odd, indeed!
 
 If you're starting Factor from the Finder, you're not going to get a PATH 
 set from your .profile or other shell dotfiles, since UI apps are launched 
 under the loginwindow session and not under any shell. To set environment 
 variables for UI apps, try setting them in ~/.MacOSX/environment.plist:
 
  
 https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPRuntimeConfig/Articles/EnvironmentVars.html
 
 -Joe
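
A rough way to see the effect Joe describes from a terminal is to start a child process with a scrubbed environment. This only approximates what a Finder-launched app inherits; the four-directory PATH is the one reported above, and the variable names (`default_path`, `seen_path`, `found`) are made up for the sketch.

```shell
# Approximation only: give the child a bare environment with the default PATH,
# i.e. one that never passed through ~/.zshrc or any other shell dotfile.
default_path="/usr/bin:/bin:/usr/sbin:/sbin"
seen_path=$(env -i PATH="$default_path" /bin/sh -c 'printf "%s" "$PATH"')
echo "child sees PATH=$seen_path"

# A directory that is only added in a dotfile (such as /usr/local/bin)
# is therefore invisible to the child.
case ":$seen_path:" in
  *":/usr/local/bin:"*) found=yes ;;
  *) found=no ;;
esac
echo "/usr/local/bin on the child's PATH: $found"
```

Since getenv simply reports whatever environment the process was started with, the libc call is behaving correctly; the difference is in how the process was launched.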
 
 
 
 -- 
 ~ Memento Amori
--
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121051231iu=/4140/ostg.clktrk___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk


Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Hi John-
Beg pardon, I should have mentioned earlier that since docsplit plants a
.txt file in the target pdf's
directory on its own, with no other output, I had gone the route you
suggested, but to no avail, i.e.,

"docsplit text --no-clean -l path" run-process drop

In the terminal, cd /path/to/1_long_gu ; docsplit text --no-clean -l
chi_sim long_gu001.pdf
works fine. The surprise is that, in the listener, the phrase:

"cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf" run-process .

- returns with status 0, but leaves no file. Ditto using /full/path/to/docsplit
in the command.

The docsplit bin alias (/usr/local/opt/ruby/bin/docsplit) resolves
to /usr/local/Cellar/ruby/2.1.0/bin/docsplit
(installed w/ homebrew). There I find this ruby script:

require 'rubygems'

version = ">= 0"

if ARGV.first
  str = ARGV.first
  str = str.dup.force_encoding("BINARY") if str.respond_to? :force_encoding
  if str =~ /\A_(.*)_\z/
    version = $1
    ARGV.shift
  end
end

gem 'docsplit', version
load Gem.bin_path('docsplit', 'docsplit', version)

If I manage to decipher this, I'll try to translate it into Factor, and
invoke docsplit that way.
That should keep me busy for a while. Worth a try, though I know zip about
ruby. Once past
this boondoggle, I already have Factor code that walks the tree & collates
the files.

Thanks!
~cw






Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
It's probably easiest to specify the full path to the file, like I did
in my previous message.  Combined with the full path to the docsplit
binary/link (for your particular problem), it should theoretically
work fine:

"/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process


Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
As a follow-up, from Factor you can use `with-directory-files`
(http://docs.factorcode.org/content/word-with-directory-files,io.directories.html)
and `absolute-path`
(http://docs.factorcode.org/content/word-absolute-path,io.pathnames.html)
to get full paths to the files in some directory:

```
IN: scratchpad "/home/alex/factor/core" [ [ absolute-path . ] each ]
with-directory-files
"/home/alex/factor/core/generic"
"/home/alex/factor/core/parser"
"/home/alex/factor/core/sorting"
[etc]
```



Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Hi Alex-

Thanks, I did try

"/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process

using both the symlink and the resolved executable:

/usr/local/opt/ruby/bin/docsplit
/usr/local/Cellar/ruby/2.1.0/bin/docsplit

but still no response, still status 0. A lightbulb went on, and I set a
duplicate symlink
in /usr/bin/docsplit (where Factor's which can find it) straight to
/usr/local/Cellar/ruby/2.1.0/bin/docsplit:

IN: scratchpad "docsplit" which .
"/usr/bin/docsplit"

-ok, but still no success with anything in io.launcher. Oy!

I see on the web that this problem calling docsplit isn't confined to
Factor. Help calls appear
in Plone-Users (http://sourceforge.net/mailarchive/message.php?msg_id=29982797)
and on stackoverflow re python
(http://stackoverflow.com/questions/18237442/execute-shell-commands-in-python-to-use-docsplit).
Let me dig around some more; this sticky wicket
must have a workaround...

I'll dig around some more.
~cw





Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
Strange.  Well, not actually strange, since many programs aren't great
about return codes...but still!  I decided to re-enact the issue by
removing /usr/local/bin (where my docsplit was installed) from my PATH,
starting Factor, and trying it out.  Looks like docsplit is dumping the txt
file in the current working directory:


IN: scratchpad "docsplit" which .
f
IN: scratchpad "docsplit text --no-clean -l eng /tmp/thesis.pdf"
run-process status>> .
255
IN: scratchpad "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf"
run-process status>> .
0
IN: scratchpad "/tmp/thesis.txt" exists? .
f
IN: scratchpad "thesis.txt" exists? .
t

Seems as though you need to tell Factor to run in another working directory:

IN: scratchpad "/tmp" [
"/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf"
run-process status>> .
 ] with-directory
0
IN: scratchpad "/tmp/thesis.txt" exists? .
t
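
The same behaviour can be reproduced outside Factor, which suggests it comes from the process's working directory rather than from io.launcher. Below is a minimal shell sketch under stated assumptions: `write_here` is a hypothetical stand-in for a tool that, like docsplit in the session above, writes its output into the current directory, and the scratch directories are invented for the demonstration.

```shell
# Hypothetical stand-in for a tool that writes its result into the current directory.
write_here() { printf 'extracted text\n' > "$(basename "$1" .pdf).txt"; }

work=$(mktemp -d)        # scratch directory holding the input file
elsewhere=$(mktemp -d)   # wherever the caller happens to be running
: > "$work/thesis.pdf"   # empty placeholder input

# Called from elsewhere with a full input path: the output follows the caller.
( cd "$elsewhere" && write_here "$work/thesis.pdf" )
[ -f "$work/thesis.txt" ] && first=beside_pdf || first=in_cwd

# Changing directory first, which is the role with-directory plays above.
( cd "$work" && write_here "$work/thesis.pdf" )
[ -f "$work/thesis.txt" ] && second=beside_pdf || second=in_cwd

echo "without cd: $first; with cd: $second"
rm -rf "$work" "$elsewhere"
```

A full path to the input file only tells the tool where to read; where it writes still depends on the directory the process runs in.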

By the way, turns out you can set the `environment` slot of an io.launcher
process, so I was thinking maybe that would help, but...

IN: scratchpad <process>
"docsplit text --no-clean -l eng /tmp/thesis.pdf" >>command
"/tmp/stdout.txt" >>stdout
+stdout+ >>stderr
{ { "PATH" "/usr/local/bin" } } >>environment
run-process status>> .
1
IN: scratchpad "/tmp/stdout.txt" utf8 file-contents print
sh: 1: pdftotext: not found

Damn. No dice. Looks like you'll have to fix the PATH issue on the system
itself.
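
One plausible reading of that failure is that PATH was replaced outright, so docsplit itself could be found but the helper it shells out to could not. The sketch below imitates this with two hypothetical stand-in scripts: `frontend` plays docsplit and `helper` plays pdftotext. The directories and names are assumptions made for illustration, not anything from docsplit itself.

```shell
# Hypothetical stand-ins: "frontend" plays docsplit, "helper" plays pdftotext.
bin_a=$(mktemp -d)   # plays /usr/local/bin
bin_b=$(mktemp -d)   # plays the directory that holds the helper
printf '#!/bin/sh\nhelper\n' > "$bin_a/frontend"
printf '#!/bin/sh\necho "helper ran"\n' > "$bin_b/helper"
chmod +x "$bin_a/frontend" "$bin_b/helper"

# PATH replaced outright: frontend starts, but it cannot find its helper.
replaced_status=0
PATH="$bin_a" "$bin_a/frontend" >/dev/null 2>&1 || replaced_status=$?

# PATH extended instead: the helper stays reachable.
extended_status=0
extended_out=$(PATH="$bin_a:$bin_b:$PATH" "$bin_a/frontend" 2>&1) || extended_status=$?

echo "replaced: exit $replaced_status; extended: exit $extended_status ($extended_out)"
rm -rf "$bin_a" "$bin_b"
```

If that reading is right, a PATH value that adds /usr/local/bin to the existing list, rather than consisting of it alone, would be the thing to try.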

Anyway, hope that helps.

(P.S.: Charles, if you're getting this message again, it's because I think
GMail might've screwed up the reply behavior and didn't send this to the
list, so I'm re-sending it.)




Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Yeah, Alex-
I would have thought the cd in my compound command string would take care
of the current directory issue.
There's another thread about this problem
(http://www.programmingrelief.com/3213645/Docsplit-Works-Fine-In-Command-Line-But-Ignores-Code-In-Ruby-Script%3F)
that finds docsplit returning files in the root directory - on my system
no files are winding up there.
Let me see what I can do w/ your path/environment suggestions.

Gonna be another long night...
Thanks much,
~cw


On Sun, Feb 9, 2014 at 4:08 PM, Alex Vondrak ajvond...@gmail.com wrote:

 Strange.  Well, not actually strange, since many programs aren't great
 about return codes...but still!  I decided to re-enact the issue by
 removing /usr/local/bin (where my docsplit was installed) from my PATH,
 starting Factor, and trying it out.  Looks like docsplit is dumping the txt
 file in the current working directory:


 IN: scratchpad docsplit which .
 f
 IN: scratchpad docsplit text --no-clean -l eng /tmp/thesis.pdf
 run-process status .
 255
 IN: scratchpad /usr/local/bin/docsplit text --no-clean -l eng
 /tmp/thesis.pdf run-process status .
 0
 IN: scratchpad /tmp/thesis.txt exists? .
 f
 IN: scratchpad thesis.txt exists? .
 t

 Seems as though you need to tell Factor to run in another working
 directory:

 IN: scratchpad /tmp [
 /usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf
 run-process status .
  ] with-directory
 0
 IN: scratchpad /tmp/thesis.txt exists? .
 t

 By the way, turns out you can set the `environment` slot of an io.launcher
 process, so I was thinking maybe that would help, but...

 IN: scratchpad process
 docsplit text --no-clean -l eng /tmp/thesis.pdf command
 /tmp/stdout.txt stdout
 +stdout+ stderr
 { { PATH /usr/local/bin } } environment
 run-process status .
 1
 IN: scratchpad /tmp/stdout.txt utf8 file-contents print
 sh: 1: pdftotext: not found

 Damn. No dice. Looks like you'll have to fix the PATH issue on the system
 itself.

 Anyway, hope that helps.

 (P.S.: Charles, if you're getting this message again, it's because I think
 GMail might've screwed up the reply behavior and didn't send this to the
 list, so I'm re-sending it.)



 On Sun, Feb 9, 2014 at 3:13 PM, CW Alston cwalsto...@gmail.com wrote:

 Hi Alex-

 Thanks, I did try

 /full/path/to/docsplit text --no-clean -l chi_sim
 /path/to/1_long_gu/long_gu001.pdf try-process

 using both the symlink and the resolved executable:

 /usr/local/opt/ruby/bin/docsplit
 /usr/local/Cellar/ruby/2.1.0/bin/docsplit

 but still no response, still status 0. A lightbulb went on, and I set a
 duplicate symlink
 in /usr/bin/docsplit (where Factor's which can find it) straight to
 /usr/local/Cellar/ruby/2.1.0/bin/docsplit:

 IN: scratchpad docsplit which .
 /usr/bin/docsplit

 -ok, but still no success with anything in io.launcher. Oy!

 I see on the web that this problem calling docsplit isn't confined to
 Factor. Help calls appear
 in 
 Plone-Usershttp://sourceforge.net/mailarchive/message.php?msg_id=29982797 
 and
 stackoverflow re 
 pythonhttp://stackoverflow.com/questions/18237442/execute-shell-commands-in-python-to-use-docsplit.
 Let me dig around some more; this sticky wicket
 must have a workaround...

 I'll dig around some more.
 ~cw




 On Sun, Feb 9, 2014 at 2:16 PM, Alex Vondrak ajvond...@gmail.com wrote:

 As a follow-up, from Factor you can use `with-directory-files`
 (http://docs.factorcode.org/content/word-with-directory-files,io.directories.html)
 and `absolute-path`
 (http://docs.factorcode.org/content/word-absolute-path,io.pathnames.html)
 to get full paths to the files in some directory:

 ```
 IN: scratchpad "/home/alex/factor/core" [ [ absolute-path . ] each ]
 with-directory-files
 "/home/alex/factor/core/generic"
 "/home/alex/factor/core/parser"
 "/home/alex/factor/core/sorting"
 [etc]
 ```


 On Sun, Feb 9, 2014 at 1:53 PM, Alex Vondrak ajvond...@gmail.com
 wrote:
  It's probably easiest to specify the full path to the file, like I did
  in my previous message.  Combined with the full path to the docsplit
  binary/link (for your particular problem), it should theoretically
  work fine:
 
  "/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process
 
  On Sun, Feb 9, 2014 at 1:00 PM, CW Alston cwalsto...@gmail.com
 wrote:
  Hi John-
  Beg pardon, I should have mentioned earlier that since docsplit plants a
  .txt file in the target pdf's directory on its own, with no other output,
  I had gone the route you suggested, but to no avail, i.e.,
 
  "docsplit text --no-clean -l path" run-process drop
 
  In the terminal,
 
  cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf
 
  works fine. The surprise is that, in the listener, the phrase:
 
  "cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf" run-process .
 
  - returns with status 0, but leaves no file. Ditto using
  /full/path/to/docsplit in the command.
 
  The docsplit bin alias 

Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
Thing is, `cd` isn't a binary that Factor can execute in a process.  It's
just a shell command implemented by bash or zsh or whatever you use.  Same
with the semicolon syntax, for that matter.  You might try to finagle
something like

IN: scratchpad { "sh" "-c" "cd /tmp ; pwd" } utf8 [ contents . ]
with-process-reader
"/tmp\n"

Not sure how the PATH stuff will work out with that, though.

You could also try just using the `-o` flag to docsplit.  Again,
deliberately messing up my PATH so Factor can't run docsplit directly:

IN: scratchpad "docsplit" which .
f
IN: scratchpad "/tmp/thesis.pdf" exists? .
t
IN: scratchpad "/tmp/thesis.txt" exists? .
f
IN: scratchpad "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf -o /tmp" try-process
IN: scratchpad "/tmp/thesis.txt" exists? .
t
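
A minimal sketch of how the `-o` approach might be wrapped in a reusable word. The word name `docsplit-text` and the hard-coded /usr/local/bin/docsplit location are assumptions made for illustration; neither comes from docsplit or from Factor. The output directory is taken to be the pdf's own folder:

```factor
USING: io.launcher io.pathnames locals ;

! Hypothetical helper: run docsplit on one pdf and write the .txt
! next to it, whatever Factor's current directory happens to be.
! Assumes docsplit is installed at /usr/local/bin/docsplit.
:: docsplit-text ( lang pdf -- )
    pdf parent-directory :> dir
    { "/usr/local/bin/docsplit" "text" "--no-clean" "-l" lang pdf "-o" dir }
    try-process ;
```

It would be called as `"chi_sim" "/path/to/1_long_gu/long_gu001.pdf" docsplit-text`. Passing the command as an array of strings rather than one long string means nothing has to be split on spaces, so paths containing whitespace should survive intact.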



On Sun, Feb 9, 2014 at 5:02 PM, CW Alston cwalsto...@gmail.com wrote:

 Yeah, Alex-
 I would have thought the cd in my compound command string would take care
 of the current directory issue.
 There's another thread about this problem
 (http://www.programmingrelief.com/3213645/Docsplit-Works-Fine-In-Command-Line-But-Ignores-Code-In-Ruby-Script%3F)
 that finds docsplit returning files in the root directory - on my system
 no files are winding up there.
 Let me see what I can do w/ your path/environment suggestions.

 Gonna be another long night...
 Thanks much,
 ~cw


 On Sun, Feb 9, 2014 at 4:08 PM, Alex Vondrak ajvond...@gmail.com wrote:

 Strange.  Well, not actually strange, since many programs aren't great
 about return codes...but still!  I decided to re-enact the issue by
 removing /usr/local/bin (where my docsplit was installed) from my PATH,
 starting Factor, and trying it out.  Looks like docsplit is dumping the txt
 file in the current working directory:


 IN: scratchpad "docsplit" which .
 f
 IN: scratchpad "docsplit text --no-clean -l eng /tmp/thesis.pdf" run-process status>> .
 255
 IN: scratchpad "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf" run-process status>> .
 0
 IN: scratchpad "/tmp/thesis.txt" exists? .
 f
 IN: scratchpad "thesis.txt" exists? .
 t

 Seems as though you need to tell Factor to run in another working
 directory:

 IN: scratchpad "/tmp" [
 "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf"
 run-process status>> .
 ] with-directory
 0
 IN: scratchpad "/tmp/thesis.txt" exists? .
 t

 By the way, turns out you can set the `environment` slot of an
 io.launcher process, so I was thinking maybe that would help, but...

 IN: scratchpad <process>
 "docsplit text --no-clean -l eng /tmp/thesis.pdf" >>command
 "/tmp/stdout.txt" >>stdout
 +stdout+ >>stderr
 { { "PATH" "/usr/local/bin" } } >>environment
 run-process status>> .
 1
 IN: scratchpad "/tmp/stdout.txt" utf8 file-contents print
 sh: 1: pdftotext: not found

 Damn. No dice. Looks like you'll have to fix the PATH issue on the system
 itself.
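
 A possible variation on the `environment` attempt, offered as an untested sketch. It rests on the assumption that, in io.launcher's default environment mode, entries in the slot override the inherited ones, so setting PATH to /usr/local/bin alone would hide the directories where pdftotext and friends live. Prepending to the inherited PATH keeps those directories visible:

```factor
USING: accessors arrays environment io.launcher kernel
prettyprint sequences ;

! Build { { "PATH" "/usr/local/bin:<inherited PATH>" } } so the child
! process can find docsplit as well as the system tools it shells out to.
<process>
    "docsplit text --no-clean -l eng /tmp/thesis.pdf" >>command
    "PATH" os-env "/usr/local/bin:" prepend
    "PATH" swap 2array 1array >>environment
run-process status>> .
```

 Whether this is enough depends on where docsplit's helper programs are actually installed; any further directories would have to be added to the prepended string in the same way.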

 Anyway, hope that helps.

 (P.S.: Charles, if you're getting this message again, it's because I
 think GMail might've screwed up the reply behavior and didn't send this to
 the list, so I'm re-sending it.)




Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Lord love a duck, Alex - I didn't realize that builtins like `cd` are
'existentially' different than utilities like `cat` -
(I only speak pidgin unix; bites me often). Thanks for the heads-up.

Okay... I'll try moving|copying my target directory into my home folder, to
obviate the need for any cd'ing (I hope),
 pass docsplit an array of pdfs and flags; or maybe have docsplit iterate
over a tmp file containing lines like:

chi_sim long_gu001.pdf
eng long_gu002.pdf
eng long_gu003.pdf ...

Probably have to do this in a script. Never a dull moment.
~cw




Re: [Factor-talk] OCR via docsplit in Factor

2014-02-08 Thread Alex Vondrak
I can't tell you what's wrong with code you haven't provided, but...

```
IN: scratchpad USING: io.files io.launcher io.encodings.ascii tools.which ;
IN: scratchpad "docsplit" which .
"/usr/local/bin/docsplit"
IN: scratchpad "/tmp/cv.pdf" exists? .
t
IN: scratchpad "/tmp/cv.txt" exists? .
f
IN: scratchpad "docsplit text --no-clean -l eng /tmp/cv.pdf" try-process
IN: scratchpad "/tmp/cv.txt" exists? .
t
IN: scratchpad "/tmp/cv.txt" ascii file-lines first .
"Alex Vondrak"
```


On Sat, Feb 8, 2014 at 2:32 AM, CW Alston cwalsto...@gmail.com wrote:
 Hi folks -

 I am thrilled to find a versatile open-source optical character recognition
 engine called docsplit (http://documentcloud.github.io/docsplit/). I've got
 it installed easily as a ruby gem, & it works just great on my Mac as a
 shell command (it also provides a ruby module):

 ➜  ~ git:(master) ✗ which docsplit
 /usr/local/opt/ruby/bin/docsplit
 ➜  ~ git:(master) ✗

 I need such a tool to extract text from a deep directory tree, with a couple
 thousand folders. Each leaf folder contains 3-6 scanned pdfs (in Chinese &
 English), from which docsplit makes a plaintext (.txt) file with the same
 basename, deposited in the same leaf directory. My Factor vocab can easily
 visit each leaf dir & prepare to pass each pdf there to docsplit in the
 format it happily handles in the terminal (I use oh-my-zsh & iTerm2).
 My Factor code chokes on this intermediate step, trying to call docsplit.

 Going to the terminal, I have to first cd to the directory containing the
 pdfs, e.g.,

 ➜  ~ git:(master) ✗ cd /path/to/1_long_gu

 then call docsplit with the appropriate flags on each pdf:

 ➜  1_long_gu git:(master) ✗ docsplit text --no-clean -l chi_sim long_gu001.pdf
 ➜  1_long_gu git:(master) ✗ docsplit text --no-clean -l eng long_gu002.pdf

 etc., for each pdf, & docsplit gives back a bunch of text files in the dir
 like

 /path/to/1_long_gu/long_gu001.txt

 In the terminal, even a compound phrase like the following works without a
 hitch:

 ➜  ~ git:(master) ✗ cd /path/to/1_long_gu ; docsplit text --no-clean -l
 chi_sim long_gu001.pdf ; docsplit text --no-clean -l eng long_gu002.pdf ;
 docsplit text --no-clean -l eng long_gu003.pdf ;...
 ➜  1_long_gu git:(master) ✗

 So, working from the terminal, I wind up with a series of text files in
 /path/to/1_long_gu that my Factor vocab amalgamates into a single text file
 (with whitespace in filename), e.g., /path/to/1_long_gu/long gu.txt, which I
 can edit for mistakes, and upload to a couchdb database.
 Joy!

 But I haven't been able to work out how to accomplish this docsplit call
 from Factor code. I have no problem traversing the directory tree (Factor's
 word each-file & the like come in very handy). I've experimented with
 io.launcher, io.pipes, shell scripts (bash, zsh, factor), & autoload shell
 functions, but flunked out. No errors with io.launcher tries; just no result.
 Need to learn something here. I routinely launch couchdb as a detached
 process.

 It would be such a boon to use docsplit in Factor. After a couple weeks lost
 at sea with this, I'm broadcasting a Mayday. Any suggestions?

 Thanks in advance,
 ~cw

 --
 ~ Memento Amori

 --
 Managing the Performance of Cloud-Based Applications
 Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
 Read the Whitepaper.
 http://pubads.g.doubleclick.net/gampad/clk?id=121051231iu=/4140/ostg.clktrk
 ___
 Factor-talk mailing list
 Factor-talk@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/factor-talk




Re: [Factor-talk] OCR via docsplit in Factor

2014-02-08 Thread John Benediktsson
Well if you want process output, you can do something like:

{ "docsplit" "text" "--no-clean" "-l" "path" } utf8 [ lines ]
with-process-reader

or without output, using a single command string:

"docsplit text --no-clean -l path" run-process drop

You can docsplit a directory of files:

: docsplit ( file -- )
    { "docsplit" "text" "--no-clean" "-l" }
    swap suffix run-process drop ;

: docsplit-all ( path -- )
    directory-files [ docsplit ] each ;

And concatenate all the files in a directory:

# bash
ls *.factor | sort | xargs -I '{}' cat '{}'

# factor
: cat-results ( path -- lines )
    directory-files [ ".txt" tail? ] filter natural-sort
    [ utf8 file-lines ] map concat ;

Or something like that, which part are you having problems with?

Best,
John.





Re: [Factor-talk] OCR via docsplit in Factor

2014-02-08 Thread CW Alston
Thanks for the replies. Maybe a clue here - I get this from which:

IN: scratchpad USE: tools.which
IN: scratchpad "docsplit" which .
f
IN: scratchpad "couchdb" which .
f
IN: scratchpad "ruby" which .
f

Whereas in the terminal:

➜  ~ git:(master) ✗ which docsplit
/usr/local/opt/ruby/bin/docsplit

➜  ~ git:(master) ✗ which couchdb
/usr/local/bin/couchdb

➜  ~ git:(master) ✗ which ruby
/usr/local/bin/ruby

Let me try moving up to the most recent development release
& see if the problem disappears. I'll get back to you.

Best,
~cw




Re: [Factor-talk] OCR via docsplit in Factor

2014-02-08 Thread John Benediktsson
That's odd, Factor's which just looks in the $PATH for your executable.

IN: scratchpad "PATH" os-env

You can read a bit about how it's implemented cross-platform:

http://re-factor.blogspot.com/2013/01/which.html




Re: [Factor-talk] OCR via docsplit in Factor

2014-02-08 Thread CW Alston
Hi -
Ok, I've upgraded using factor-macosx-x86-32-2013-07-25-14-21.dmg,
still Version 0.97. Same issue with Factor's which:

IN: scratchpad USE: tools.which
IN: scratchpad "couchdb" which .
f

IN: scratchpad "python" which .
"/usr/bin/python"

- The trouble appears to be with reporting my PATH properly, via getenv:

IN: scratchpad USE: environment
IN: scratchpad "PATH" os-env .
"/usr/bin:/bin:/usr/sbin:/sbin"

IN: scratchpad USE: unix.ffi
IN: scratchpad "PATH" getenv .
"/usr/bin:/bin:/usr/sbin:/sbin"

IN: scratchpad \ getenv see
USING: alien.c-types alien.syntax ;
IN: unix.ffi
LIBRARY: libc FUNCTION: c-string getenv ( c-string name ) ;
inline

- Here's my actual PATH, as seen in the terminal:

➜  ~ git:(master) ✗ echo $PATH
/usr/local/bin:/usr/local/opt/ruby/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/cwalston/factor:/Users/cwalston/bin:/usr/local/go/bin:/usr/local/lib/node_modules:/usr/local/narwhal/bin:/usr/texbin:/usr/X11/bin:/usr/local/sbin:/Users/cwalston/.gem/ruby/1.8/bin:/Applications/Mozart.app/Contents/Resources/bin

- whereby which correctly finds couchdb:

➜  ~ git:(master) ✗ which couchdb
/usr/local/bin/couchdb

So, Factor's which (et al.) doesn't search beyond
/usr/bin:/bin:/usr/sbin:/sbin.

Reading through man getenv (GETENV(3), on OSX 10.6.8 ), doesn't give me
a clue as to how to rectify this short-sightedness via the libc getenv.

This is probably a side issue to my docsplit quandary (but maybe not).
Anyone see a way to report my actual PATH to which in Factor? My PATH is
augmented in my .zshrc. I don't understand why the libc function doesn't
read it. Odd, indeed!

~cw
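
A possible stopgap, sketched under the assumption that the missing tools live in /usr/local/bin and /usr/local/opt/ruby/bin (the first two entries of the terminal PATH shown above): extend PATH from inside the listener with `set-os-env` from the `environment` vocabulary, so that `which` and any launched processes see the longer search path for the rest of the session.

```factor
USING: environment kernel prettyprint sequences tools.which ;

! Prepend the two /usr/local directories to the PATH Factor inherited.
"PATH" os-env "/usr/local/bin:/usr/local/opt/ruby/bin:" prepend
"PATH" set-os-env

! With the longer PATH in place, this lookup should now succeed.
"docsplit" which .
```

This only changes the environment of the running Factor process; a fresh Factor started from the Finder would presumably come up with the short PATH again.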


On Sat, Feb 8, 2014 at 4:39 PM, John Benediktsson mrj...@gmail.com wrote:

 Thats odd, Factor's which just looks in the $PATH for your executable.

 IN: scratchpad PATH os-env

 You can read a bit about how its implemented cross-platform:

 http://re-factor.blogspot.com/2013/01/which.html


 On Sat, Feb 8, 2014 at 2:30 PM, CW Alston cwalsto...@gmail.com wrote:

 Thanks for the replies. Maybe a clue here - I get this from which:

 IN: scratchpad USE: tools.which
 IN: scratchpad docsplit which .
 f
 IN: scratchpad couchdb which .
 f
 IN: scratchpad ruby which  .
 f

 Whereas in the terminal:

 ➜  ~ git:(master) ✗ which docsplit
 /usr/local/opt/ruby/bin/docsplit

 ➜  ~ git:(master) ✗ which couchdb
 /usr/local/bin/couchdb

 ➜  ~ git:(master) ✗ which ruby
 /usr/local/bin/ruby

 Let me try moving up to the most recent development release
  see if the problem disappears. I'll get back to you.

 Best,
 ~cw



 On Sat, Feb 8, 2014 at 7:42 AM, John Benediktsson mrj...@gmail.comwrote:

 Well if you want process output, you can do something like:

 { "docsplit" "text" "--no-clean" "-l" "path" } utf8 [ lines ]
 with-process-reader

 or without output, using a single command string:

 "docsplit text --no-clean -l path" run-process drop

 You can docsplit a directory of files:

 : docsplit ( file -- )
 { "docsplit" "text" "--no-clean" "-l" }
 swap suffix run-process drop ;

 : docsplit-all ( path -- )
 directory-files [ docsplit ] each ;

 And concatenate all the files in a directory:

 # bash
 ls *.factor | sort | xargs -I '{}' cat '{}'

 # factor
 : cat-results ( path -- lines )
 directory-files [ ".txt" tail? ] filter natural-sort
 [ utf8 file-lines ] map concat ;

 Or something like that, which part are you having problems with?

 Best,
 John.



 On Sat, Feb 8, 2014 at 2:32 AM, CW Alston cwalsto...@gmail.com wrote:

 Hi folks -

 I am thrilled to find a versatile open-source optical character recognition
 engine called docsplit http://documentcloud.github.io/docsplit/.
 I've got it installed easily as a ruby gem, & it works just great on my Mac
 as a shell command (it also provides a ruby module):

 ➜  ~ git:(master) ✗ which docsplit
 /usr/local/opt/ruby/bin/docsplit
 ➜  ~ git:(master) ✗

 I need such a tool to extract text from a deep directory tree, with a couple
 thousand folders. Each leaf folder contains 3-6 scanned pdfs (in Chinese &
 English), from which docsplit makes a plaintext (.txt) file with the same
 basename, deposited in the same leaf directory. My Factor vocab can easily
 visit each leaf dir & prepare to pass each pdf there to docsplit in the
 format it happily handles in the terminal (I use oh-my-zsh & iTerm2).
 My Factor code chokes on this intermediate step, trying to call docsplit.

 Going to the terminal, I have to first cd to the directory containing
 the pdfs, e.g.,

 ➜  ~ git:(master) ✗ cd /path/to/1_long_gu

 then call docsplit with the appropriate flags on each pdf:

 ➜  1_long_gu git:(master) ✗ docsplit text --no-clean -l chi_sim
 long_gu001.pdf
 ➜  1_long_gu git:(master) ✗ docsplit text --no-clean -l eng
 long_gu002.pdf

 etc., for each pdf, & docsplit gives back a bunch of text files in the
 dir like

 /path/to/1_long_gu/long_gu001.txt
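
The cd-then-docsplit routine described above could be scripted along these lines. This is only a sketch: the function name ocr_tree and the DOCSPLIT override variable are invented for the example, it applies one language code to the whole tree, and the docsplit flags are simply the ones quoted in this message.

```shell
#!/bin/sh
# ocr_tree ROOT LANG
# Hypothetical helper: run docsplit on every PDF under ROOT, from inside the
# PDF's own directory, mirroring the manual cd-then-docsplit session.
# DOCSPLIT is an assumed override so a full path to the executable can be
# supplied when the launching environment has a short PATH.
ocr_tree() {
    root=$1
    lang=$2
    find "$root" -type f -name '*.pdf' | while IFS= read -r pdf; do
        dir=$(dirname "$pdf")
        file=$(basename "$pdf")
        # Subshell, so the cd does not carry over to the next PDF.
        ( cd "$dir" && "${DOCSPLIT:-docsplit}" text --no-clean -l "$lang" "$file" )
    done
}
```

Something like ocr_tree /path/to/tree eng would then visit every leaf directory in turn. Folders that mix chi_sim and eng scans would still need a per-file language choice, which this sketch does not attempt.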

 In the terminal, even a compound phrase like the following works
 without a hitch:

 ➜  ~ git:(master) ✗ cd /path/to/1_long_gu ; 

Re: [Factor-talk] OCR via docsplit in Factor

2014-02-08 Thread Joe Groff
On Sat, Feb 8, 2014 at 7:30 PM, CW Alston cwalsto...@gmail.com wrote:

 Hi -
 Ok, I've upgraded using factor-macosx-x86-32-2013-07-25-14-21.dmg,
 still Version 0.97. Same issue with Factor's which:

 IN: scratchpad USE: tools.which
 IN: scratchpad "couchdb" which .
 f

 IN: scratchpad "python" which .
 "/usr/bin/python"

 - The trouble appears to be with reporting my PATH properly, via getenv:

 IN: scratchpad USE: environment
 IN: scratchpad "PATH" os-env .
 "/usr/bin:/bin:/usr/sbin:/sbin"

 IN: scratchpad USE: unix.ffi
 IN: scratchpad "PATH" getenv .
 "/usr/bin:/bin:/usr/sbin:/sbin"

 IN: scratchpad \ getenv see
 USING: alien.c-types alien.syntax ;
 IN: unix.ffi
 LIBRARY: libc FUNCTION: c-string getenv ( c-string name ) ;
 inline

 - Here's my actual PATH, as seen in the terminal:

 ➜  ~ git:(master) ✗ echo $PATH

 /usr/local/bin:/usr/local/opt/ruby/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/cwalston/factor:/Users/cwalston/bin:/usr/local/go/bin:/usr/local/lib/node_modules:/usr/local/narwhal/bin:/usr/texbin:/usr/X11/bin:/usr/local/sbin:/Users/cwalston/.gem/ruby/1.8/bin:/Applications/Mozart.app/Contents/Resources/bin

 - whereby which correctly finds couchdb:

 ➜  ~ git:(master) ✗ which couchdb
 /usr/local/bin/couchdb

 So, Factor's which (et al.) doesn't search beyond
 /usr/bin:/bin:/usr/sbin:/sbin.

 Reading through man getenv (GETENV(3), on OSX 10.6.8) doesn't give me
 a clue as to how to rectify this short-sightedness via the libc getenv.

 This is probably a side issue to my docsplit quandary (but maybe not).
 Anyone see a way to report my actual PATH to which in Factor? My PATH is
 augmented in my .zshrc. I don't understand why the libc function doesn't
 read it. Odd, indeed!

 If you're starting Factor from the Finder, you're not going to get a PATH
set from your .profile or other shell dotfiles, since UI apps are launched
under the loginwindow session and not under any shell. To set environment
variables for UI apps, try setting them in ~/.MacOSX/environment.plist:


https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPRuntimeConfig/Articles/EnvironmentVars.html

-Joe
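
To make the point above concrete: getenv only reports what the process inherited from whatever launched it, and nothing re-reads .zshrc along the way. The sketch below imitates a Finder-style launch by starting a child shell with a cleared environment and only the short PATH seen in this thread. The variable names are invented for the example, and env -i is just a stand-in for the loginwindow session, not how the Finder actually starts apps.

```shell
#!/bin/sh
# The short PATH that Factor reported when started from the Finder.
gui_path="/usr/bin:/bin:/usr/sbin:/sbin"

# Child started with a cleared environment plus that PATH: it sees exactly
# what it was handed, and no shell dotfile is consulted.
child_path=$(env -i PATH="$gui_path" /bin/sh -c 'printf %s "$PATH"')
echo "child sees: $child_path"

# If the launcher hands over a longer PATH, the child inherits that instead.
fixed_path=$(env -i PATH="/usr/local/bin:$gui_path" /bin/sh -c 'printf %s "$PATH"')
echo "after fix:  $fixed_path"
```

Starting Factor from a terminal, or giving the full path to the executable, sidesteps the problem for the same reason: the process then either inherits the shell's PATH or does not need it at all.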