Re: Save the file sent to the ExtractingRequestHandler locally on the server.

2010-11-17 Thread Kaustuv Royburman
A possible solution is to upload the files to a directory on the 
server, monitor that directory for new uploads, and then post the 
documents to Solr using curl.
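
As a rough illustration of the monitoring step, the sketch below polls a 
directory and reports files it has not seen before. It is only a minimal 
example under stated assumptions: the function name find_new_files and the 
idea of keeping a "seen" set in memory are hypothetical, and it does not 
check whether an upload has finished writing.

```python
import os


def find_new_files(upload_dir, seen):
    """Return names of regular files in upload_dir that are not yet in `seen`.

    `seen` is a set owned by the caller; it is updated in place so that a
    second call only reports files that appeared in between.
    """
    new_files = []
    for name in sorted(os.listdir(upload_dir)):
        path = os.path.join(upload_dir, name)
        # Skip sub-directories and anything already reported.
        if os.path.isfile(path) and name not in seen:
            seen.add(name)
            new_files.append(name)
    return new_files
```

Calling find_new_files in a loop with a short sleep gives a simple poller; 
each returned name would then be posted to Solr.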


If you are using a Linux-based server, you can use inotifywatch to 
monitor the folder for new file uploads and then run the following curl 
commands:


curl "http://:/solr/update/extract?literal.id=&uprefix=attr_&fmap.content=content&literal.type=binaryfile&literal.url=<http://yourdomain/uploads_directory/documentname.ext>" \
  -F "myfile=@"


curl http://:/solr/update --data-binary '' \
  -H 'Content-type:text/xml; charset=utf-8'


Replace the following with actual values:
:
 : Actual name with extension
<http://yourdomain/uploads_directory/documentname.ext>
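
For illustration, the extract URL used in the first curl command can be 
assembled programmatically so that the literal.url value is escaped 
correctly. This is a sketch, not a definitive implementation: the function 
name build_extract_url and the host, port, id and link values in the usage 
note are assumptions, while the parameter names are the ones from the 
command above.

```python
from urllib.parse import urlencode


def build_extract_url(host, port, doc_id, public_url):
    """Build the /update/extract URL with the parameters used above.

    host, port, doc_id and public_url are supplied by the caller; they
    stand in for the values to be replaced in the curl command.
    """
    params = [
        ("literal.id", doc_id),
        ("uprefix", "attr_"),
        ("fmap.content", "content"),
        ("literal.type", "binaryfile"),
        ("literal.url", public_url),  # link the user can later open
    ]
    # urlencode percent-escapes reserved characters such as ':' and '/'.
    return "http://%s:%s/solr/update/extract?%s" % (host, port, urlencode(params))
```

For example, build_extract_url("localhost", 8983, "report.pdf", 
"http://example.com/uploads/report.pdf") yields a URL that can be passed to 
curl together with -F "myfile=@..." for the uploaded file.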




---
Regards,
Kaustuv Royburman

Senior Software Developer
infoservices.in
DLF IT Park,
Rajarhat, 1st Floor, Tower - 3
Major Arterial Road,
Kolkata - 700156,
India

On Thursday 18 November 2010 12:10 PM, Lance Norskog wrote:
Upload the files independently of Solr. Solr is not a content 
management system.
One problem is getting the links put together so that the link that 
comes out with the document can be turned into a link the user can open.


Chad Salamon wrote:

I would like to save files sent to the ExtractingRequestHandler on the
server processing it, and provide a link to the file in the solr
document. I currently am running a solr core as a part of a larger web
app, and I would like to publish the files as a part of that same web
app. This way, both solr and the files can be behind the same security
filters (Spring Security).

I can think of two ways to do this - one would be to extend
ExtractingRequestHandler to grab the files and then save them where I
want to. The other would be to upload the files independently of Solr
and then send them to the ExtractingRequestHandler through remote
streaming.

Any other suggestions would be appreciated. Thanks.




Solr server with utf-8 support in jetty

2010-11-17 Thread Kaustuv Royburman

I am running Solr from the example folder using the command

java -jar start.jar

When I run the test_utf8.sh script from the exampledocs folder, I get the 
following output:


Solr server is up.
HTTP GET is accepting UTF-8
HTTP POST is accepting UTF-8
HTTP POST defaults to UTF-8
ERROR: HTTP GET is not accepting UTF-8 beyond the basic multilingual plane
ERROR: HTTP POST is not accepting UTF-8 beyond the basic multilingual plane
ERROR: HTTP POST + URL params is not accepting UTF-8 beyond the basic multilingual plane



Can someone give me some pointers on how to configure Jetty to accept 
UTF-8 beyond the basic multilingual plane?

I will be using Jetty to run Solr.
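
To make "beyond the basic multilingual plane" concrete, the sketch below 
compares a character inside the BMP with one outside it. Code points above 
U+FFFF need four bytes in UTF-8 and two 16-bit units (a surrogate pair) in 
UTF-16. The function name describe and the choice of U+10400 as the sample 
character are assumptions made for illustration; this shows what the failing 
checks exercise, not why Jetty rejects it.

```python
from urllib.parse import quote


def describe(ch):
    """Summarize how a single character is encoded on the wire."""
    utf8 = ch.encode("utf-8")
    # utf-16-be has no byte-order mark, so length / 2 is the unit count.
    utf16_units = len(ch.encode("utf-16-be")) // 2
    return {
        "code_point": "U+%04X" % ord(ch),
        "utf8_bytes": len(utf8),
        "utf16_units": utf16_units,
        "percent_encoded": quote(ch, safe=""),
    }


bmp = describe("\u00e9")             # inside the BMP
supplementary = describe("\U00010400")  # outside the BMP
```

Here bmp reports two UTF-8 bytes and one UTF-16 unit, while supplementary 
reports four UTF-8 bytes and two UTF-16 units, with the percent-encoded form 
%F0%90%90%80.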



--
---
Regards,
Kaustuv Royburman


Error while indexing files with Solr

2010-11-11 Thread Kaustuv Royburman

Hi,
I am trying to index documents (PDF, Doc, XLS, RTF) using the 
ExtractingRequestHandler.


I am following the tutorial at 
http://wiki.apache.org/solr/ExtractingRequestHandler

But when I run the following command:

curl "http://localhost:8983/solr/update/extract?literal.id=mydoc.doc&uprefix=attr_&fmap.content=attr_content" \
  -F "myfile=@/home/system/Documents/mydoc.doc"


I get the following error:




HTTP ERROR: 500

lazy loading error

org.apache.solr.common.SolrException: lazy loading error
   at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:249)
   at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
   at org.mortbay.jetty.Server.handle(Server.java:285)
   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
   at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
   at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
   at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler'
   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
   at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
   at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
   ... 21 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.extraction.ExtractingRequestHandler not found in java.net.URLClassLoader{urls=[], parent=contextloa...@null}
   at java.net.URLClassLoader.findClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.Class.forName(libgcj.so.90)
   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
   ... 24 more

RequestURI=/solr/update/extract

Powered by Jetty://

I am running Debian Lenny with java version "1.6.0_22", and 
apache-solr-1.4.1 started from the example directory.

Please point me in the right direction and help me solve the problem.
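
A trace like the one above can be summarized mechanically to find the 
innermost cause and to see which runtime produced the frames. The sketch 
below is only illustrative: the function name summarize_trace is 
hypothetical, and reading frames reported as libgcj.so.90 as a sign that the 
JVM serving the request is GCJ rather than the 1.6.0_22 one is an inference 
from the trace, not a confirmed diagnosis.

```python
import re


def summarize_trace(trace):
    """Return the last 'Caused by' exception class and the GCJ frame count."""
    # Each "Caused by:" line names an exception class before the next colon.
    causes = re.findall(r"Caused by:\s*([\w.$]+)", trace)
    # Frames from GCJ show the shared library instead of File.java:line.
    gcj_frames = re.findall(r"\(libgcj\.so[^)]*\)", trace)
    return {
        "root_cause": causes[-1] if causes else None,
        "gcj_frames": len(gcj_frames),
    }
```

For the trace above this would report java.lang.ClassNotFoundException as 
the root cause together with a non-zero number of GCJ frames.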



--
---
Regards,
Kaustuv Royburman