How to pass parameters from the UNIX command line?

2003-08-14 Thread Fabián R. Breschi
Dear List,

It may seem a silly one, but I'm somewhat confused about how to pass
parameters from the UNIX command line:

Suppose I have my working Perl script and pass a parameter on a URL as:

http://server.domain.com/cgi-bin/MyProcedure.pl?cust_id=x

I'd like to make a cron job that runs the above Perl script from the
command line, something like:

perl /usr/local/apache/cgi-bin/MyProcedure.pl [need to pass the
parameter cust_id=x here]

How can I tell the procedure to consider cust_id=x as its input?
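
A minimal sketch, assuming MyProcedure.pl reads its parameters through
CGI.pm: when no web-server environment is present, CGI.pm accepts
name=value pairs given on the command line, so a cron job can call the
script directly.

  #!/usr/bin/perl
  # MyProcedure.pl -- reads cust_id whether invoked via HTTP or from cron
  use strict;
  use warnings;
  use CGI;

  # With no HTTP environment, CGI.pm parses name=value pairs from @ARGV,
  # so "perl MyProcedure.pl cust_id=x" populates param('cust_id').
  my $q = CGI->new;
  my $cust_id = $q->param('cust_id')
      or die "no cust_id given (try: MyProcedure.pl cust_id=x)\n";

  print $q->header('text/plain');
  print "Processing customer: $cust_id\n";

A crontab entry could then call it directly (the schedule is illustrative):

  0 2 * * * perl /usr/local/apache/cgi-bin/MyProcedure.pl cust_id=x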

Many thanks indeed for your patience.

Fabian.



Re: General interest question: PDF contents handling in PostgreSQL.

2002-11-27 Thread Fabián R. Breschi
Thanks a lot for the advice,

I'll let you know.

Fabian.

Perrin Harkins wrote:

 Fabián R. Breschi wrote:

I wonder if, using mod_perl and PostgreSQL, there is any possibility
 of approximating what Oracle calls 'interMedia': in this particular
 case, parsing/indexing the content of PDF files stored inside
 PostgreSQL as a LOB, or alternatively stored as a flat OS file with
 its metadata parsed/indexed into the RDBMS.


 You can easily add this to DBIx::FullTextSearch.  All you need to do
 is write a simple front end that uses a PDF-reading module to extract
 the text.  However, DBIx::FullTextSearch uses MySQL rather than PostgreSQL.

 - Perrin
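
A rough sketch of the kind of front end Perrin describes, under some
assumptions: the create/index_document/contains interface from
DBIx::FullTextSearch's synopsis, a MySQL handle (the module's requirement
noted above), pdftotext standing in for a PDF-reading module, and a
made-up index name fts_pdf.

  #!/usr/bin/perl
  use strict;
  use warnings;
  use DBI;
  use DBIx::FullTextSearch;

  # MySQL handle -- DBIx::FullTextSearch keeps its index tables here.
  my $dbh = DBI->connect('DBI:mysql:database=ftsdb', 'login', 'password',
                         { RaiseError => 1 });

  # First run: create the index; the 'string' front end means we hand the
  # document text to the module ourselves (later runs would use ->open).
  my $fts = DBIx::FullTextSearch->create($dbh, 'fts_pdf',
                                         frontend => 'string');

  # The "PDF reading module" step, here simply pdftotext writing to stdout.
  my $file = shift @ARGV;
  my $text = `pdftotext "$file" -`;

  $fts->index_document($file, $text);            # index under the file name
  print "$_\n" for $fts->contains('invoice');    # documents matching a word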







Re: General interest question: PDF contents handling in PostgreSQL.

2002-11-27 Thread Fabián R. Breschi



Rob Nagler wrote:

  
I agree with you about the overhead provoked by the Oracle solution.  In
particular, using interMedia with the 'Internet File System' option of
8i/9i, things get extremely complex in terms of manageability.  On the
other hand, the user-friendly interface that lets you drop a file into the
DB and have it indexed on the fly has a high cost in terms of system

It isn't indexed on the fly in our version (8i).  Has this changed?
You have to run the indexer regularly, so in this it is no better than
external indexing solutions.  Indeed, one of the big problems is that
you can't qualify the query *prior* to index search afaik.  It seems
to search the entire index always.  In our case, this is extremely
costly, because our space naturally divides, and isolated indexes
would solve the problem much more efficiently.

Oracle claims that file searches within the DB take a fraction of the time
compared to MS flat files in the IFS solution from 8i onwards (Enterprise
Edition); obviously that doesn't mean the indexing performs well compared
to an analogous solution.  I hadn't paid attention to the reindexing needed
after dropping a doc into IFS, since I had definitively abandoned the idea
due to performance issues.  Looking back at the history, from Context to
interMedia, the solution has now become 'UltraSearch', whose improvements I
personally still have to get acquainted with.

  
  
resources; from my personal point of view, this particular workflow did not scale well on existing systems that had only the RDBMS installed and no spare capacity, especially in terms of CPU/memory resources.

It scales enough, if you aren't trying to solve the google problem. :-)
For our users, it's OK performance, even for the heavy internal users.
Just being able to search message boards and file areas (including Word
docs) is a huge plus for us.

I have tried IFS on a system that was doing the RDBMS job well, a low-to-mid
sized, tuned configuration on Solaris 2.6 and Sun SPARC.  IFS made a horrible
first impression on us by bringing the system to its knees.  Frankly, I didn't
have the time or patience to find out whether there was a chance to tune it up
a little more and achieve that scaling; in my opinion it would have been a
waste of time in that particular situation without really scaling up the machine.

  
  
Following your suggestion, I could dump the PDF's textual contents extracted with pdftotext into a 'TEXT' column in PostgreSQL, and then use a search engine to look inside it, approximating interMedia's functionality.



  Regarding the search engine, I guess it would be necessary to have at least an unstructured text search algorithm, along with something like SOUNDEX in Oracle.
  
  I don't think intermedia uses SOUNDEX.  It does pure keyword
  matching.  It's particularly bad in my opinion.  It also doesn't learn
  what people really want to know.  For example, if you search:
  http://www.bivio.com/pub/search?s=taxes
  You always get the IRS Pubs, but this is rarely what people are
  looking for on our site (although they should read the publications,
  they are more interested in what bivio can do for them in terms of
  taxes).  Note the performance on the search.  The data set you are
  searching in the public case is very small in comparison to the whole
  document database which is multi-GB.

  Hope this helps.
  Rob
  
I'm not sure what interMedia uses to search text; it certainly doesn't
learn anything about searches (and I don't know what 'UltraSearch' is
capable of, despite all the hype Oracle is putting into this technology,
as usual).  Regarding the search on bivio.com, it's quite okay in terms
of human-perceived response time, but it could probably do better
considering a 12-page indexed data set.
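
A minimal sketch of the pdftotext-into-a-TEXT-column idea quoted above,
assuming a made-up pdf_text table and using a naive ILIKE match in place
of a real search engine:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use DBI;

  # Assumed table:
  #   CREATE TABLE pdf_text (id serial, filename text, body text);
  my $dbh = DBI->connect('DBI:Pg:dbname=mydb', 'login', 'password',
                         { RaiseError => 1 });

  my ($file, $keyword) = @ARGV;            # e.g. report.pdf taxes

  # Extract the textual contents; "-" makes pdftotext write to stdout.
  my $text = `pdftotext "$file" -`;
  die "pdftotext failed on $file\n" if $?;

  $dbh->do('INSERT INTO pdf_text (filename, body) VALUES (?, ?)',
           undef, $file, $text);

  # Poor man's search: case-insensitive substring match over the stored text.
  my $rows = $dbh->selectall_arrayref(
      'SELECT filename FROM pdf_text WHERE body ILIKE ?',
      undef, "%$keyword%");
  print "$_->[0]\n" for @$rows;

A real full-text indexer could replace the ILIKE scan once the text is in
the database.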
  


Thanks a lot for your valuable suggestions. I will let you know in case
of further developments on what we've been talking about.

All the best.

Fabian.




General interest question: PDF contents handling in PostgreSQL.

2002-11-26 Thread Fabián R. Breschi
Dear Group,

   I wonder if, using mod_perl and PostgreSQL, there is any possibility of
approximating what Oracle calls 'interMedia': in this particular case,
parsing/indexing the content of PDF files stored inside PostgreSQL as a LOB,
or alternatively stored as a flat OS file with its metadata parsed/indexed
into the RDBMS.
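
A minimal sketch of the 'LOB plus metadata' half of this, assuming a
made-up pdf_docs table and DBD::Pg's bytea binding; it only gets the file
and some metadata into PostgreSQL, with no interMedia-style indexing:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use DBI;
  use DBD::Pg qw(:pg_types);

  # Assumed table:
  #   CREATE TABLE pdf_docs (id serial, filename text, content bytea);
  my $dbh = DBI->connect('DBI:Pg:dbname=mydb', 'login', 'password',
                         { RaiseError => 1 });

  my $file = shift @ARGV;
  open my $fh, '<', $file or die "cannot open $file: $!";
  binmode $fh;
  my $pdf = do { local $/; <$fh> };        # slurp the whole PDF

  my $sth = $dbh->prepare(
      'INSERT INTO pdf_docs (filename, content) VALUES (?, ?)');
  $sth->bind_param(1, $file);
  $sth->bind_param(2, $pdf, { pg_type => PG_BYTEA });  # binary-safe bind
  $sth->execute;
  $dbh->disconnect;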

From what I can understand, this would involve PostgreSQL directly, thought
of as having functionality analogous to Oracle 8i/9i; as far as I know this
feature is not implemented natively, but perhaps it could have been developed
separately as a procedural object or similar.

Perhaps something exists in regard to mod_perl used alongside the RDBMS itself.

Any suggestion will be highly appreciated.

Many thanks indeed.

Fabian R. Breschi






Three tier computing: suggestion needed for DBI connection.

2002-08-26 Thread Fabián R. Breschi



Hello all,

At the moment I'm running Apache 1.3.12+mod_perl 1.27 with PG 7.2.1 via DBI-DBD
on a SS5 Solaris 2.6 machine.

I'd like to separate the database engine from the Apache environment,
running Apache+mod_perl+DBI+DBD on the front end and PostgreSQL on the back
end (I hope I'm right with this particular interpretation of three-tier and
the split of modules...).

I have glanced around for DBI connect scenarios but could not find any useful
example.

My questions are:

- How do I set up my connection string, from
  $dbh = DBI->connect('DBI:Pg:dbname=mydb','login','password'),
  so that 'dbname' includes the host name, i.e. 'dbname=mydb@Ultra1', with
  Ultra1 being a fully qualified alias in my hosts table? (See the sketch
  below.)
- Provided the above is possible, I imagine that leaving PG installed on
  the front end would only be useful for 'psql -h Ultra1 mydb', and not
  necessarily be used by DBI?
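
A minimal sketch, assuming the back end is reachable as Ultra1 and
PostgreSQL there accepts TCP/IP connections: with DBD::Pg the host goes
into the DSN as a separate 'host=' attribute rather than a
'dbname=mydb@Ultra1' form.

  #!/usr/bin/perl
  use strict;
  use warnings;
  use DBI;

  # Host (and optionally port) live in the DSN; the front end only needs
  # DBI, DBD::Pg and the PostgreSQL client library, not a local server.
  my $dbh = DBI->connect(
      'DBI:Pg:dbname=mydb;host=Ultra1;port=5432',
      'login', 'password',
      { RaiseError => 1, AutoCommit => 1 },
  );

  my ($now) = $dbh->selectrow_array('SELECT now()');
  print "Connected to mydb on Ultra1, server time: $now\n";
  $dbh->disconnect;

'psql -h Ultra1 mydb' exercises the same client-side path, so it remains a
handy connectivity check; the back-end postmaster must be started with
TCP/IP enabled (the -i flag / tcpip_socket setting in that PostgreSQL
version) and pg_hba.conf must allow the front-end host.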

Any suggestions are much appreciated.

Fabian.