Integrating Solr with Database
Hello,

I'm currently working on a file management system based on Solr. So far I have a Solr server and a Windows client application running on different computers. When the client indexes a rich document to the Solr server remotely, it also uploads the file itself via FTP, so that anyone who searches for the document can download the raw file from the server.

What I want now is for a database to be updated with the pair (document ID in Solr, path of the raw file on the server) whenever the client indexes a document and uploads the raw file. Then, on the search result page, instead of giving a direct link to the raw file, I'd like the server to look up the database by the Solr document ID and return the linked file path.

As I'm new to databases, Apache, RESTful APIs, and so on, I'm not sure how to begin implementing this feature. Any help or starting point would be appreciated.

Thank you.

--
View this message in context: http://lucene.472066.n3.nabble.com/Integrating-Solr-with-Database-tp4019692.html
Sent from the Solr - User mailing list archive at Nabble.com.
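One possible starting point for the lookup described above, sketched in Java with JDK classes only. Everything named here is an assumption made for illustration: the class FilePathRegistry, the /files?id=... URL shape, and the in-memory map, which merely stands in for a real database table with columns such as (solr_id, file_path).

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a database table (solr_id -> file_path). */
class FilePathRegistry {
    private final Map<String, String> paths = new ConcurrentHashMap<>();

    /** Would be called by the indexing client after the FTP upload succeeds. */
    void register(String solrId, String filePath) {
        paths.put(solrId, filePath);
    }

    /** Would be called by the search page to resolve a Solr document ID. */
    Optional<String> lookup(String solrId) {
        return Optional.ofNullable(paths.get(solrId));
    }

    /** Extracts the value of "id" from a raw query string such as "id=doc-1". */
    static Optional<String> idFromQuery(String rawQuery) {
        if (rawQuery == null) return Optional.empty();
        for (String pair : rawQuery.split("&")) {
            if (pair.startsWith("id=") && pair.length() > 3) {
                return Optional.of(URLDecoder.decode(pair.substring(3), StandardCharsets.UTF_8));
            }
        }
        return Optional.empty();
    }

    /** Serves GET /files?id=<solrId>: 200 with the path, or 404 if the ID is unknown. */
    HttpServer serve(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/files", exchange -> {
            Optional<String> path = idFromQuery(exchange.getRequestURI().getRawQuery())
                    .flatMap(this::lookup);
            byte[] body = path.orElse("unknown document id").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(path.isPresent() ? 200 : 404, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

With something like this in place, the search page would link to /files?id=<document ID> rather than to the raw file, so the real path never has to appear in the Solr index.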
Re: Integrating Solr with Database
> This might make sense if you were using Solr to search for the ID of an
> object in the database with relations to other objects. However, if all
> you are doing is retrieving the file path/URL, why not index that into
> Solr, and get it directly from Solr?

That's what I'm doing right now, but because of some naming and security issues I'd like to integrate Solr with a database eventually.

> If you still want to do what you had in mind, you should handle that as
> part of your indexing process, i.e., update both Solr and the database at
> the same time.

I have thought about that, but I could not figure out how to update the database while I'm updating Solr. I'm pretty sure the database has to be connected with Solr somehow (first difficulty), and the database has to be updated remotely from a Windows Forms application written in C# (second difficulty).

Thank you.

--
View this message in context: http://lucene.472066.n3.nabble.com/Integrating-Solr-with-Database-tp4019692p4019695.html
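The "update both at the same time" suggestion can be read as a single client-side step that writes to Solr first, then records the file path, and undoes the Solr write if the second step fails. A rough Java sketch under stated assumptions: SearchIndex and PathStore are hypothetical interfaces (not part of Solr or of any database driver) that would wrap the real Solr client and a database connection, and InMemoryIndex exists only to try the sketch out. In this arrangement Solr and the database are both called by the client and do not need to be connected to each other.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical wrapper around the real Solr client. */
interface SearchIndex {
    void add(String id, String text);
    void delete(String id);
}

/** Hypothetical wrapper around a database table (solr_id, file_path). */
interface PathStore {
    void save(String id, String filePath);
}

/** In-memory stand-in for Solr, only for experimenting with the sketch. */
class InMemoryIndex implements SearchIndex {
    final Map<String, String> docs = new HashMap<>();
    public void add(String id, String text) { docs.put(id, text); }
    public void delete(String id) { docs.remove(id); }
}

class DualWriter {
    private final SearchIndex index;
    private final PathStore store;

    DualWriter(SearchIndex index, PathStore store) {
        this.index = index;
        this.store = store;
    }

    /**
     * Indexes the document, then records its path. If recording fails,
     * the index entry is removed again so the two stay consistent.
     * Returns true only when both writes succeeded.
     */
    boolean indexAndRecord(String id, String text, String filePath) {
        index.add(id, text);
        try {
            store.save(id, filePath);
            return true;
        } catch (RuntimeException e) {
            index.delete(id); // compensate for the failed database write
            return false;
        }
    }
}
```

The same ordering could be implemented in the C# client; the point of the sketch is only the write-then-compensate sequence, not the language.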
Solr UI for File Search
Hello,

I'm almost done with my file (rich document) search system on both the server and the client side. What I have to do now is configure the search result interface so that it displays results properly and attaches a link to each matching file. (It just shows the XML result now.)

I cannot simply use another application because I added my own file parsers to Tika. What would be my best option for adding a nice UI to my system without disturbing what I already have?

Thanks.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-UI-for-File-Search-tp4015476.html
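One lightweight option that leaves the indexing side untouched is a thin page that reads the XML Solr already returns and turns each hit into a link. The Java sketch below assumes Solr's default XML response layout (doc elements containing str elements with a name attribute) and two fields, id and url; the url field and the ResultPage class are hypothetical and would need to match the actual schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ResultPage {
    /** Returns the text of <str name="field"> inside one <doc>, or "" if absent. */
    static String field(Element doc, String field) {
        NodeList strs = doc.getElementsByTagName("str");
        for (int i = 0; i < strs.getLength(); i++) {
            Element e = (Element) strs.item(i);
            if (field.equals(e.getAttribute("name"))) return e.getTextContent();
        }
        return "";
    }

    /** Escapes the characters that are special in HTML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Turns a Solr XML response into an HTML list with one link per document. */
    static String render(String solrXml) {
        try {
            Document xml = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(solrXml)));
            NodeList docs = xml.getElementsByTagName("doc");
            StringBuilder html = new StringBuilder("<ul>\n");
            for (int i = 0; i < docs.getLength(); i++) {
                Element doc = (Element) docs.item(i);
                html.append("  <li><a href=\"").append(escape(field(doc, "url")))
                    .append("\">").append(escape(field(doc, "id"))).append("</a></li>\n");
            }
            return html.append("</ul>").toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a parseable Solr XML response", e);
        }
    }
}
```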
Solr adding header and footer to streamed documents
Hello,

I'm trying to write a custom parser and add it to Tika, but I'm not having much success so far. I have a binary (executable) that converts my custom file type into an XML file, so inside my custom parser I convert the custom file to XML and then call XMLParser on the result.

However, when I write the InputStream stream (the argument of the parse function) out to a File, it seems that Solr is adding a header and a footer containing metadata, so the file is not converted properly. (http://wiki.apache.org/solr/ExtractingRequestHandler#Metadata)

The following is added as a header:

    000: 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
    010: 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 3139  --------------19
    020: 3230 3862 3937 3764 6637 0d0a 436f 6e74  208b977df7..Cont
    030: 656e 742d 4469 7370 6f73 6974 696f 6e3a  ent-Disposition:
    040: 2066 6f72 6d2d 6461 7461 3b20 6e61 6d65   form-data; name
    050: 3d22 6d79 6669 6c65 223b 2066 696c 656e  ="myfile"; filen
    060: 616d 653d 2268 7770 322e 6877 7022 0d0a  ame="hwp2.hwp"..
    070: 436f 6e74 656e 742d 5479 7065 3a20 6170  Content-Type: ap
    080: 706c 6963 6174 696f 6e2f 6f63 7465 742d  plication/octet-
    090: 7374 7265 616d 0d0a 0d0a d0cf 11e0 a1b1  stream..........

The following is added as a footer:

    0002290: 0d0a 2d2d 2d2d
    00022a0: 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
    00022b0: 2d2d 2d2d 2d2d 2d2d 2d2d 3139 3230 3862  ----------19208b
    00022c0: 3937 3764 6637 2d2d 0d0a                 977df7--..

How can I prevent Solr from adding the header and footer?

Thank you.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-adding-header-and-footer-to-streamed-documents-tp4003439.html
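The bytes in the dump decode to a multipart/form-data part header (a boundary line, Content-Disposition, Content-Type, then a blank line), followed by d0cf 11e0 a1b1, which are the first bytes of the OLE2 compound-file signature, so the file itself appears to begin right after the blank line. A small diagnostic may help confirm what the parser is actually receiving. The Java sketch below is an assumption-laden aid, not a fix: StreamSniffer and its methods are hypothetical names, the "--" check is only a heuristic, and the code inspects the stream without changing how Solr or the HTTP client behaves.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class StreamSniffer {
    /** First six bytes of the OLE2 compound-file signature, as seen in the dump. */
    private static final int[] OLE2 = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1};

    /** Reads up to n bytes and then resets the stream, so nothing is consumed. */
    static byte[] peek(BufferedInputStream in, int n) {
        try {
            in.mark(n);
            byte[] buf = new byte[n];
            int read = in.readNBytes(buf, 0, n);
            in.reset();
            return Arrays.copyOf(buf, read);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Heuristic: "multipart" if the data starts with "--", "ole2" if it starts with the signature. */
    static String classify(byte[] head) {
        if (head.length >= 2 && head[0] == '-' && head[1] == '-') {
            return "multipart";
        }
        if (head.length >= OLE2.length) {
            boolean match = true;
            for (int i = 0; i < OLE2.length; i++) {
                if ((head[i] & 0xFF) != OLE2[i]) match = false;
            }
            if (match) return "ole2";
        }
        return "unknown";
    }
}
```

Logging classify(peek(stream, 8)) at the top of the parse function would show whether the multipart framing is already present when the parser is called.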
Runtime.exec() not working on Tomcat
I have the following code in my Apache Tika Maven project. It works when I test locally, but fails when it is attached as an external jar in Apache Solr (the container is Tomcat).

The String cmd holds the command that converts the file, in the form:

    ./convert.bin input.custom output.xml

I checked that convert.bin and input.custom exist.

    String cmd; // as explained above
    File out = new File(dir_path, "output.xml"); // dir_path is the directory path
    Process ps = null;
    try {
        ps = Runtime.getRuntime().exec(cmd); // execute the command
        int exitVal = ps.waitFor();
        logger.info("Executing Runtime successful with exit value of " + exitVal); // exitVal is 0
    } catch (Exception e) {
        logger.error("Exception in executing Runtime: " + e); // not reached
    }

    // I get "Out file does not exist", although I should get the proper output
    if (out.exists())
        logger.info("Out file exists");
    else
        logger.info("Out file does not exist"); // reaches here

    out.setWritable(true);
    out.setReadable(true);
    out.setExecutable(true);
    out.deleteOnExit();

    // I get FileNotFoundException here
    InputStream xml_stream = new FileInputStream(out);

I'm really confused because I get the right result locally (Maven test), but not when it runs on Tomcat. Any help, please?

--
View this message in context: http://lucene.472066.n3.nabble.com/Runtime-exec-not-working-on-Tomcat-tp4002614.html
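One thing that may be worth ruling out: relative paths such as ./convert.bin and output.xml are resolved against the working directory of the running process, which under Tomcat is usually not the project directory used by a Maven test. This is a guess about the case above, not a diagnosis. The Java sketch below shows how the working directory can be set explicitly with ProcessBuilder and how the command's output can be captured for logging; CommandRunner and Result are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class CommandRunner {
    /** Exit code plus everything the command printed (stdout and stderr merged). */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs the command with workDir as its working directory. */
    static Result run(File workDir, List<String> command) {
        try {
            Process p = new ProcessBuilder(command)
                    .directory(workDir)          // relative paths now resolve here
                    .redirectErrorStream(true)   // merge stderr into stdout
                    .start();
            // Drain the output before waiting so a full pipe cannot block the child.
            String output = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            return new Result(p.waitFor(), output);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }
}
```

With the names from the post, a call might look like run(new File(dir_path), Arrays.asList(new File(dir_path, "convert.bin").getAbsolutePath(), "input.custom", "output.xml")), after which output.xml would be expected inside dir_path. Logging new File(".").getAbsolutePath() in both environments is a quick way to compare the two working directories.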
Setting metadata while indexing custom file
Hello,

I'd like to set the Content-Type of a file when I use ExtractingRequestHandler to pass the file to Tika. I'm indexing a custom file type, and it seems that Tika is not matching my file to the right custom parser, so I need to declare the Content-Type of my custom file explicitly to make sure the right parser is selected.

So far, passing the file name through the resource.name parameter has not worked for me. How can I do this?

Thanks.

--
View this message in context: http://lucene.472066.n3.nabble.com/Setting-metadata-while-indexing-custom-file-tp4000781.html
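ExtractingRequestHandler accepts a stream.type request parameter that states the MIME type explicitly, alongside resource.name as a file-name hint. A minimal Java sketch of building such a request URL follows. The Solr base URL, the document ID, and the MIME type application/x-custom are placeholders, ExtractUrl is a hypothetical helper, and the MIME type would have to be one that the custom parser reports from getSupportedTypes.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class ExtractUrl {
    /** Builds an /update/extract URL that names the file and declares its MIME type. */
    static String build(String solrBase, String id, String fileName, String mimeType) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("literal.id", id);          // unique key of the document
        params.put("resource.name", fileName); // file-name hint for type detection
        params.put("stream.type", mimeType);   // explicit MIME type for Tika
        params.put("commit", "true");

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            query.add(p.getKey() + "=" + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return solrBase + "/update/extract?" + query;
    }
}
```

The file body would then be sent to that URL as the request content. If the file is uploaded as a multipart form instead, the Content-Type of the file part is another place where the type can be declared.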