Re: Geo spatial clustering of points

2013-08-23 Thread David Smiley (@MITRE.org)
Hi Chris & Jeroen,

Tonight I posted some tips on Solr's wiki on this subject:
http://wiki.apache.org/solr/SpatialClustering
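
A quick sketch of one common approach (my illustration, not necessarily the
exact recipe on that page): index a truncated geohash of each point into a
plain string field, one field per zoom tier (the field name geohash_4 below
is hypothetical), and facet on it:

  ...&facet=true&facet.field=geohash_4&facet.limit=-1&facet.mincount=1

Each facet value is then a grid cell whose count is the cluster size; shorter
prefixes give coarser cells for lower zoom levels, and the cell's geohash can
be decoded back to a representative coordinate.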

~ David


Chris Atkinson wrote
 Did you get any resolution for this? I'm about to implement something
 identical.
 On 3 Jul 2013 23:03, Jeroen Steggink <jeroen@> wrote:
 
 Hi,

 I'm looking for a way to cluster (or should I call it group?) geospatial
 points on a map based on the current zoom level, and to get the median
 coordinate for each cluster.
 Let's say I'm at the world level, and I want to cluster spatial points
 within a 1000 km radius. When I zoom in I only want to get the clustered
 points for that boundary. Let's say all the points within the US,
 clustered within a 500 km radius.

 I'm using Solr 4.3.0 and looked into SpatialRecursivePrefixTreeFieldType
 with faceting. However, I'm not sure if the geohashes are of any use for
 clustering points.

 Does anyone have any experience with geo spatial clustering with Solr?

 Regards,

 jeroen








-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Geo-spatial-clustering-of-points-tp4075315p4086243.html
Sent from the Solr - User mailing list archive at Nabble.com.


how to integrate solr with HDFS HA

2013-08-23 Thread YouPeng Yang
Hi all
I am trying to integrate Solr with HDFS HA. When I start the Solr server, it
throws an exception [1].
I know this is because the hadoop.conf.Configuration in
HdfsDirectoryFactory.java does not include the HA configuration.
So I want to know: in Solr, is there any way to include my Hadoop HA
configuration?


[1]---
Caused by: java.lang.IllegalArgumentException:
java.net.UnknownHostException: lklcluster
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:415)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:382)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:123)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2277)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2311)
at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:2299)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:364)
at
org.apache.solr.store.hdfs.HdfsDirectory.<init>(HdfsDirectory.java:59)
at
org.apache.solr.core.HdfsDirectoryFactory.create(HdfsDirectoryFactory.java:154)
at
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:350)
at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:256)
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:469)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:759)


Reloading synonyms and stop words

2013-08-23 Thread Bruno René Santos
Hello,

Is it possible to reload the synonyms and stopwords files without rebooting
solr?

Regards
Bruno Santos

-- 
Bruno René Santos
Lisboa - Portugal


Re: Reloading synonyms and stop words

2013-08-23 Thread Shalin Shekhar Mangar
Yes, you can use the Core RELOAD command:

https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D
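
If you'd rather trigger it programmatically, here is a minimal SolrJ sketch
(untested; the core name collection1 is an assumption):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class ReloadCore {
    public static void main(String[] args) throws Exception {
        // Point at the Solr root (not a specific core) so the request
        // reaches the CoreAdminHandler at /admin/cores.
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        // Issues action=RELOAD&core=collection1; synonyms and stopwords
        // are re-read when the core's analyzers are rebuilt.
        CoreAdminRequest.reloadCore("collection1", server);
        server.shutdown();
    }
}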

On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com wrote:
 Hello,

 Is it possible to reload the synonyms and stopwords files without rebooting
 solr?

 Regards
 Bruno Santos

 --
 Bruno René Santos
 Lisboa - Portugal



-- 
Regards,
Shalin Shekhar Mangar.


Re: Flushing cache without restarting everything?

2013-08-23 Thread Toke Eskildsen
On Thu, 2013-08-22 at 20:08 +0200, Walter Underwood wrote:
 We warm the file buffers before starting Solr to avoid spending time
 waiting for disk IO. The script is something like this:
 
 for core in core1 core2 core3
 do
 find /apps/solr/data/${core}/index -type f | xargs cat > /dev/null
 done

On that subject, I will note that the reason for not just doing a simple
cp index_files  /dev/null is that the shell (sometimes? always?) is
clever enough to skip the copying when it is done directly to /dev/null.

 It makes a big difference in the first few minutes of service. Of
 course, it helps if you have enough RAM to hold the entire index.

That is actually essential if you want to perform a reproducible test.
If there is not enough free RAM for the full set of index files, the
first files from find will not stay cached at all, and if the index has
changed since the last test, it becomes pretty arbitrary which parts are
cached and which are not.
-- 
Ceterum censeo spinning drives esse delendam



Re: when does RAMBufferSize work when commit.

2013-08-23 Thread YouPeng Yang
Hi Shawn

Thanks a lot. I got it.

Regards


2013/8/22 Shawn Heisey s...@elyograg.org

 On 8/22/2013 2:25 AM, YouPeng Yang wrote:
  Hi all
  About the RAMBufferSize and commit, I have read this doc:
  http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/60544
 
 I cannot figure out how they work together.
 
Given the settings:
 
   <ramBufferSizeMB>10</ramBufferSizeMB>
   <autoCommit>
     <maxTime>${solr.autoCommit.maxDocs:1000}</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>
 
   If the indexed docs number up to 1000 and their total size is below 10MB,
  it will trigger a commit.
 
   If the size of the indexed docs reaches 10MB while their number is still
  below 1000, it will not trigger a commit; the indexed docs will just be
  flushed to disk, and it will only commit when the number reaches 1000?

 Your actual config seems to have its wires crossed a little bit.  You
 have the autoCommit.maxDocs value being used in a maxTime tag, not a
 maxDocs tag.  You may want to adjust the variable name or the tag.
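
 For example, to make the commits doc-count-based as the variable name
 suggests, one possible correction (a sketch, not your exact config):

   <autoCommit>
     <maxDocs>${solr.autoCommit.maxDocs:1000}</maxDocs>
     <openSearcher>false</openSearcher>
   </autoCommit>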

 If that were a maxDocs tag instead of maxTime, your description would be
 pretty much right on the money.  The space taken in the RAM buffer is
 typically larger than the actual document size, but the general idea is
 sound.

 The default for RAMBufferSizeMB in recent Solr versions is 100.  Unless
 you've got super small documents, or you are in a limited memory
 situation and have a lot of cores, I would not go smaller than that.

 Thanks,
 Shawn




Leader election

2013-08-23 Thread Srivatsan
Hi,

I am using Solr 4.4 for my search application. I was indexing some 1 million
docs. At that time, I accidentally killed the leader node of that collection.
Indexing failed with the exception,

/org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this
request:[http://localhost:9133/solr/collection7_shard1_replica3,
http://localhost:8983/solr/collection7_shard1_replica1]
at
org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:333)
at
org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:318)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)/

After that I checked the Solr admin page; leader election didn't get triggered
for that collection.

http://lucene.472066.n3.nabble.com/file/n4086259/Screenshot.png 

I couldn't index into that collection, but I can still search from it.

Please help me with this issue.

Thanks in advance

Srivatsan



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Leader-election-tp4086259.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Reloading synonyms and stop words

2013-08-23 Thread Bruno René Santos
Great! What about inside a RequestHandler source code in Java? I want to
create a requestHandler that receives new synonyms, insert them on the
synonyms file and reload the core.

Regards
Bruno


On Fri, Aug 23, 2013 at 9:28 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 Yes, you can use the Core RELOAD command:


 https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D

 On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com
 wrote:
  Hello,
 
  Is it possible to reload the synonyms and stopwords files without
 rebooting
  solr?
 
  Regards
  Bruno Santos
 
  --
  Bruno René Santos
  Lisboa - Portugal



 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Bruno René Santos
Lisboa - Portugal


Re: Solr Ref guide question

2013-08-23 Thread Yago Riveiro
The version is 4.4.

I did the download, unzipped, and ran the command in the example folder: java -jar
start.jar

It is a fresh install, no modifications done.

-- 
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Thursday, August 22, 2013 at 9:22 PM, Brendan Grainger wrote:

 What version of solr are you using? Have you copied a solr.xml from
 somewhere else? I can almost reproduce the error you're getting if I put a
 non-existent core in my solr.xml, e.g.:
 
 <solr>
 
 <cores adminPath="/admin/cores">
 <core name="core0" instanceDir="a_non_existent_core" />
 </cores>
 ...
 
 
 On Thu, Aug 22, 2013 at 1:30 PM, yriveiro yago.rive...@gmail.com 
 (mailto:yago.rive...@gmail.com) wrote:
 
  Hi all,
  
  I think there is something missing in Solr's ref guide.
  
  The section Running Solr says to run Solr using the command:
  
  $ java -jar start.jar
  
  But if I do this with a fresh install, I get a stack trace like this:
  http://pastebin.com/5YRRccTx
  
  Is this behavior expected?
  
  
  
  -
  Best regards
  --
  View this message in context:
  http://lucene.472066.n3.nabble.com/Solr-Ref-guide-question-tp4086142.html
  Sent from the Solr - User mailing list archive at Nabble.com 
  (http://Nabble.com).
  
 
 
 
 
 -- 
 Brendan Grainger
 www.kuripai.com (http://www.kuripai.com)
 
 




Indexing status when one tomcat goes down

2013-08-23 Thread Prasi S
Hi all,
I'm running SolrCloud with Solr 4.4. I have 2 Tomcat instances with 4
shards (2 in each).

What will happen if one of the Tomcats goes down during indexing? The other
Tomcat reports the status "Leader not active" in the logs.

Regards,
Prasi


Re: Solr Ref guide question

2013-08-23 Thread Yago Riveiro
I found the problem.  

The Java version had been overridden by a dependency and was 1.5.

After reinstalling Java, Solr works as expected.

Maybe throwing an error saying that the version is not compatible would help
in these cases.

--  
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Friday, August 23, 2013 at 10:30 AM, Yago Riveiro wrote:

 The version is 4.4.
  
 I did the download, unzipped, and ran the command in the example folder: java -jar
 start.jar
  
 It is a fresh install, no modifications done.
  
 --  
 Yago Riveiro
 Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
  
  
 On Thursday, August 22, 2013 at 9:22 PM, Brendan Grainger wrote:
  
  What version of solr are you using? Have you copied a solr.xml from
  somewhere else? I can almost reproduce the error you're getting if I put a
  non-existent core in my solr.xml, e.g.:
   
  <solr>
   
  <cores adminPath="/admin/cores">
  <core name="core0" instanceDir="a_non_existent_core" />
  </cores>
  ...
   
   
  On Thu, Aug 22, 2013 at 1:30 PM, yriveiro yago.rive...@gmail.com 
  (mailto:yago.rive...@gmail.com) wrote:
   
   Hi all,

   I think there is something missing in Solr's ref guide.

   The section Running Solr says to run Solr using the command:

   $ java -jar start.jar

   But if I do this with a fresh install, I get a stack trace like this:
   http://pastebin.com/5YRRccTx

   Is this behavior expected?



   -
   Best regards
   --
   View this message in context:
   http://lucene.472066.n3.nabble.com/Solr-Ref-guide-question-tp4086142.html
   Sent from the Solr - User mailing list archive at Nabble.com 
   (http://Nabble.com).

   
   
   
   
  --  
  Brendan Grainger
  www.kuripai.com (http://www.kuripai.com)
   
   
   
  
  



Re: Solr Ref guide question

2013-08-23 Thread Yago Riveiro
Oops, sorry for the last email, wrong language :P

What I wanted to say was:

Maybe throwing an error saying that the version was not compatible would help
in these cases.

--  
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Friday, August 23, 2013 at 10:53 AM, Yago Riveiro wrote:

 I found the problem.  
  
 The Java version had been overridden by a dependency and was 1.5.
  
 After reinstalling Java, Solr works as expected.
  
 Maybe throwing an error saying that the version is not compatible would help
 in these cases.
  
 --  
 Yago Riveiro
 Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
  
  
 On Friday, August 23, 2013 at 10:30 AM, Yago Riveiro wrote:
  
  The version is 4.4.
   
  I did the download, unzipped, and ran the command in the example folder: java -jar
  start.jar
   
  It is a fresh install, no modifications done.
   
  --  
  Yago Riveiro
  Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
   
   
  On Thursday, August 22, 2013 at 9:22 PM, Brendan Grainger wrote:
   
   What version of solr are you using? Have you copied a solr.xml from
   somewhere else? I can almost reproduce the error you're getting if I put a
   non-existent core in my solr.xml, e.g.:

   <solr>

   <cores adminPath="/admin/cores">
   <core name="core0" instanceDir="a_non_existent_core" />
   </cores>
   ...


   On Thu, Aug 22, 2013 at 1:30 PM, yriveiro yago.rive...@gmail.com 
   (mailto:yago.rive...@gmail.com) wrote:

Hi all,
 
I think there is something missing in Solr's ref guide.
 
The section Running Solr says to run Solr using the command:
 
$ java -jar start.jar
 
But if I do this with a fresh install, I get a stack trace like this:
http://pastebin.com/5YRRccTx
 
Is this behavior expected?
 
 
 
-
Best regards
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Ref-guide-question-tp4086142.html
Sent from the Solr - User mailing list archive at Nabble.com 
(http://Nabble.com).
 




   --  
   Brendan Grainger
   www.kuripai.com (http://www.kuripai.com)



   
   
  



Re: Reloading synonyms and stop words

2013-08-23 Thread Shalin Shekhar Mangar
I don't think that should be a problem. Your custom RequestHandler
must call reload. Note that a new instance of your request handler
will be created and inform will be called on it once reload happens
i.e. you won't be able to keep any state in the request handler across
core reloads.

You can also do this at a level above RequestHandler i.e. via a custom
CoreAdminHandler. See CoreAdminHandler.handleCustomAction()

On Fri, Aug 23, 2013 at 2:57 PM, Bruno René Santos brunor...@gmail.com wrote:
 Great! What about inside a RequestHandler source code in Java? I want to
 create a requestHandler that receives new synonyms, insert them on the
 synonyms file and reload the core.

 Regards
 Bruno


 On Fri, Aug 23, 2013 at 9:28 AM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

 Yes, you can use the Core RELOAD command:


 https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D

 On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com
 wrote:
  Hello,
 
  Is it possible to reload the synonyms and stopwords files without
 rebooting
  solr?
 
  Regards
  Bruno Santos
 
  --
  Bruno René Santos
  Lisboa - Portugal



 --
 Regards,
 Shalin Shekhar Mangar.




 --
 Bruno René Santos
 Lisboa - Portugal



-- 
Regards,
Shalin Shekhar Mangar.


Re: Leader election

2013-08-23 Thread Shalin Shekhar Mangar
How long has the shard been without a leader? How many shards? How
many replicas per shard? Which version of Solr?

On Fri, Aug 23, 2013 at 2:51 PM, Srivatsan ranjith.venkate...@gmail.com wrote:
 Hi,

 I am using Solr 4.4 for my search application. I was indexing some 1 million
 docs. At that time, I accidentally killed the leader node of that collection.
 Indexing failed with the exception,

 /org.apache.solr.client.solrj.SolrServerException: No live SolrServers
 available to handle this
 request:[http://localhost:9133/solr/collection7_shard1_replica3,
 http://localhost:8983/solr/collection7_shard1_replica1]
 at
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:333)
 at
 org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:318)
 at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
 at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
 at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)/

 After that I checked the Solr admin page; leader election didn't get triggered
 for that collection.

 http://lucene.472066.n3.nabble.com/file/n4086259/Screenshot.png

 I couldn't index into that collection, but I can still search from it.

 Please help me with this issue.

 Thanks in advance

 Srivatsan



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Leader-election-tp4086259.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Regards,
Shalin Shekhar Mangar.


Re: Leader election

2013-08-23 Thread Srivatsan
Almost 15 minutes. After that I restarted the entire cluster. I am using Solr
4.4 with 1 shard and 3 replicas.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Leader-election-tp4086259p4086287.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Leader election

2013-08-23 Thread Shalin Shekhar Mangar
Any exceptions in the logs of the other replicas? The default
leaderVoteWait time is 3 minutes, after which a leader election should
have been initiated automatically.

On Fri, Aug 23, 2013 at 4:01 PM, Srivatsan ranjith.venkate...@gmail.com wrote:
 Almost 15 minutes. After that I restarted the entire cluster. I am using Solr
 4.4 with 1 shard and 3 replicas.



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Leader-election-tp4086259p4086287.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Regards,
Shalin Shekhar Mangar.


Re: Leader election

2013-08-23 Thread Srivatsan
No exceptions. And the leaderVoteWait value will be used only during startup,
right? A new leader will be elected once the leader node is down. Am I right?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Leader-election-tp4086259p4086290.html
Sent from the Solr - User mailing list archive at Nabble.com.


Caused by: java.net.SocketException: Connection reset by peer: socket write error solr querying

2013-08-23 Thread aniljayanti
Hi,

I am working with Solr 4.4 on Jetty, and generated an index of 3,350,128
records. Now I want to test the query performance, so I ran a load test for 5
minutes with 600 virtual users issuing different Solr queries. After the test
completed I got the errors below.

ERROR - 2013-08-23 09:49:43.867; org.apache.solr.common.SolrException;
null:org.eclipse.jetty.io.EofException
at 
org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:914)
at
org.eclipse.jetty.http.AbstractGenerator.blockForOutput(AbstractGenerator.java:507)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:170)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:263)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:106)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:190)
at org.apache.solr.util.FastWriter.flush(FastWriter.java:141)
at org.apache.solr.util.FastWriter.write(FastWriter.java:126)
at java.io.Writer.write(Writer.java:140)
at org.apache.solr.response.XMLWriter.startTag(XMLWriter.java:144)
at org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:347)
at org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:295)
at org.apache.solr.schema.StrField.write(StrField.java:67)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:130)
at org.apache.solr.response.XMLWriter.writeArray(XMLWriter.java:273)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:190)
at 
org.apache.solr.response.XMLWriter.writeSolrDocument(XMLWriter.java:199)
at
org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:275)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:172)
at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:111)
at
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:39)
at
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:647)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:375)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketException: Connection reset by peer: socket write
error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at 

Problem with importing tab-delimited csv file

2013-08-23 Thread Rob Koeling Ai
I'm having trouble importing a tab-delimited file with the csv update handler.

My data file looks like this:

"id"	"question"	"answer"	"url"
"q99"	"Who?"	"You!"	"none"

When I send this data to Solr using curl:

curl 'http://localhost:8181/solr/development/update/csv?commit=true&separator=%09' --data @sample.tmp

All seems to be well:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">221</int></lst>
</response>

But when I query the development core, there is no data. I must be overlooking
something trivial. I would appreciate if anyone could spot what!

 - Rob




How to patch Solr4.2 for SolrEnityProcessor Sub-Enity issue

2013-08-23 Thread harshchawla
According to
http://stackoverflow.com/questions/15734308/solrentityprocessor-is-called-only-once-for-sub-entities?lq=1
we can use the patched SolrEntityProcessor from
https://issues.apache.org/jira/browse/SOLR-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel
to solve the sub-entity problem.

I tried renaming the jar file to zip and then replacing the patched file, but
since I only had a java file I couldn't replace the class file with it, so I
dropped that idea.

Here is what I tried next. I decompiled the original jar,
solr-dataimporthandler-4.2.0.jar, present in the Solr 4.2 package. Then I
applied the patch file and tried to compile the files to build the jar again.
But I started getting compilation errors.

.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: ')' expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) { ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: <identifier> expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) { ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: not a statement
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) { ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: illegal start of expression
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) { ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: ';' expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) { ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: ';' expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) { ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:398: not a statement
/* 433 */ XPathEntityProcessor.2.this.val$throwExp.set(false); ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:398: ';' expected
/* 433 */ XPathEntityProcessor.2.this.val$throwExp.set(false); ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:406: not a statement
/* 442 */ XPathEntityProcessor.2.this.val$isEnd.set(true); ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:406: ';' expected
/* 442 */ XPathEntityProcessor.2.this.val$isEnd.set(true); ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:409: not a statement
/* 445 */ XPathEntityProcessor.2.this.offer(row); ^
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:409: ';' expected
/* 445 */ XPathEntityProcessor.2.this.offer(row); ^

12 errors

Any idea how to patch Solr 4.2 for this issue?

I thought the fix would be included in Solr 4.4, but it is not. Any help on
this would be appreciated.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-patch-Solr4-2-for-SolrEnityProcessor-Sub-Enity-issue-tp4086292.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR Prevent solr of modifying fields when update doc

2013-08-23 Thread Erick Erickson
Well, not much in the way of help because you can't do what you
want AFAIK. I don't think UUID is suitable for your use-case. Why not
use your uniqueId?

Or generate something yourself...

Best
Erick


On Thu, Aug 22, 2013 at 5:56 PM, Luís Portela Afonso meligalet...@gmail.com
 wrote:

 Hi,

 How can I prevent Solr from updating some fields when updating a doc?
 The problem is, I have a UUID in a field named uuid, but it is not a
 unique key. When an RSS source updates a feed, Solr will update the doc with
 the same link, but it generates a new UUID. This is not desired, because
 this id is what I use to relate feeds to a user.

 Can someone help me?

 Many Thanks


Re: Storing query results

2013-08-23 Thread Erick Erickson
No, there's nothing like that in Solr. The closest you
could come would be to not do a hard commit (openSearcher=true)
or a soft commit for a very long time. As long as neither
of these things happen, the search results won't
change. But that's a hackish solution.

In fact I question your basic assumption. You say you don't
want the search results to change. But
1> the user probably wouldn't notice
2> this can mislead in completely different ways. What if
 most of the search results were deleted after the
 first query? What if the _exact_ document she was
 looking for got indexed after the first query?

This is one of those features that at first blush sounds
somewhat reasonable, but I don't think stands up under
inspection. It'd be some amount of work for, IMO, dubious
utility.

If you _must_ do something like this, the app layer could
do something like request rows=1000&fl=id
and essentially re-implement the queryResultCache at the
app layer. Subsequent pages would cause you to issue
queries like id:(1 OR 54 OR 90 ...).
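
A rough SolrJ sketch of that pattern (my illustration only; the query, field
names, and URL are made up, and it assumes the requested page is in range):

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;

public class SnapshotPager {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        // 1) Run the real query once, fetch only ids, keep them app-side.
        SolrQuery q = new SolrQuery("title:foo");
        q.setRows(1000);
        q.setFields("id");
        List<String> ids = new ArrayList<String>();
        for (SolrDocument doc : server.query(q).getResults()) {
            ids.add((String) doc.getFieldValue("id"));
        }

        // 2) For page N, fetch just that slice of ids by key.
        // (Real-world ids may need query escaping or quoting.)
        int page = 2, pageSize = 10;
        int from = Math.min(page * pageSize, ids.size());
        int to = Math.min(from + pageSize, ids.size());
        StringBuilder idQuery = new StringBuilder("id:(");
        for (int i = from; i < to; i++) {
            if (i > from) idQuery.append(" OR ");
            idQuery.append(ids.get(i));
        }
        idQuery.append(")");
        System.out.println(server.query(new SolrQuery(idQuery.toString())).getResults());
        server.shutdown();
    }
}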

Best
Erick


On Thu, Aug 22, 2013 at 6:00 PM, Ahmet Arslan iori...@yahoo.com wrote:

 Hi jfeist,

 Your mail reminds me this blog, not sure about solr though.


 http://blog.mikemccandless.com/2011/11/searcherlifetimemanager-prevents-broken.html



 
  From: jfeist jfe...@llminc.com
 To: solr-user@lucene.apache.org
 Sent: Friday, August 23, 2013 12:09 AM
 Subject: Storing query results


 I am in the process of setting up a search application that allows the user
 to view paginated query results.  The documents are highly dynamic but I
 want the search results to be static, i.e. I don't want the user to click
 the next-page button, have the query rerun, and end up with a different set of
 search results because the data changed while he was looking through it.  I
 want the results stored somewhere else and the successive page queries to
 draw from that.  I know Solr has query result caching, but I want to store
 it entirely.  Does Solr provide any functionality like this?  I imagine it
 doesn't, because then you'd need to specify how long to store it, etc.  I'm
 using Solr 4.4.0.  I found someone asking something similar  here
 http://lucene.472066.n3.nabble.com/storing-results-td476351.html   but
 that was 6 years ago.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Storing-query-results-tp4086182.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: SOLR Prevent solr of modifying fields when update doc

2013-08-23 Thread Luís Portela Afonso
Hi, thanks for the answer, but the uniqueId is generated by me. When Solr
indexes and there is an update to a doc, it deletes the doc and creates a new
one, so it generates a new UUID.
That is not suitable for me, because I want Solr to update just some fields,
since the UUID is the key that I use to map the doc to a user in my database.

Right now I'm using information that comes from the source and never changes
as my uniqueId, like for example the guid that exists in some RSS feeds, or,
if it doesn't exist, the link.

I think there is no simple solution for me, because from what I have read,
when an update to a doc happens, Solr deletes the old one and creates a new
one, right?

On Aug 23, 2013, at 12:07 PM, Erick Erickson erickerick...@gmail.com wrote:

 Well, not much in the way of help because you can't do what you
 want AFAIK. I don't think UUID is suitable for your use-case. Why not
 use your uniqueId?
 
 Or generate something yourself...
 
 Best
 Erick
 
 
 On Thu, Aug 22, 2013 at 5:56 PM, Luís Portela Afonso meligalet...@gmail.com
 wrote:
 
 Hi,
 
  How can I prevent Solr from updating some fields when updating a doc?
  The problem is, I have a UUID in a field named uuid, but it is not a
  unique key. When an RSS source updates a feed, Solr will update the doc with
  the same link, but it generates a new UUID. This is not desired, because
  this id is what I use to relate feeds to a user.
 
 Can someone help me?
 
 Many Thanks



smime.p7s
Description: S/MIME cryptographic signature


Re: Measuring SOLR performance

2013-08-23 Thread Dmitry Kan
Hi Roman,

With adminPath="/admin" or adminPath="/admin/cores", no. Interestingly
enough, though, I can access
http://localhost:8983/solr/statements/admin/system

But I can access http://localhost:8983/solr/admin/cores only with
adminPath="/admin/cores" (which suggests that this is the right value to be
used for cores), and not with adminPath="/admin".

Bottom line, this core configuration is not self-evident.

Dmitry




On Fri, Aug 23, 2013 at 4:18 AM, Roman Chyla roman.ch...@gmail.com wrote:

 Hi Dmitry,
 So it seems solrjmeter should not assume the adminPath - and perhaps needs
 to be passed as an argument. When you set the adminPath, are you able to
 access localhost:8983/solr/statements/admin/cores ?

 roman


 On Wed, Aug 21, 2013 at 7:36 AM, Dmitry Kan solrexp...@gmail.com wrote:

  Hi Roman,
 
  I have noticed a difference with different solr.xml config contents. It
 is
  probably legit, but thought to let you know (tests run on fresh checkout
 as
  of today).
 
  As mentioned before, I have two cores configured in solr.xml. If the file
  is:
 
  [code]
  <solr persistent="false">
 
    <!--
    adminPath: RequestHandler path to manage cores.
      If 'null' (or absent), cores will not be manageable via request handler
    -->
    <cores adminPath="/admin/cores" host="${host:}"
           hostPort="${jetty.port:8983}" hostContext="${hostContext:solr}">
      <core name="metadata" instanceDir="metadata" />
      <core name="statements" instanceDir="statements" />
    </cores>
  </solr>
  [/code]
 
  then the instruction:
 
  python solrjmeter.py -a -x ./jmx/SolrQueryTest.jmx -q
  ./queries/demo/demo.queries -s localhost -p 8983 -a --durationInSecs 60
 -R
  cms -t /solr/statements -e statements -U 100
 
  works just fine. If however the solr.xml has adminPath set to "/admin",
  solrjmeter produces an error:
 
  [error]
  **ERROR**
    File "solrjmeter.py", line 1386, in <module>
      main(sys.argv)
    File "solrjmeter.py", line 1278, in main
      check_prerequisities(options)
    File "solrjmeter.py", line 375, in check_prerequisities
      error('Cannot find admin pages: %s, please report a bug' % apath)
    File "solrjmeter.py", line 66, in error
      traceback.print_stack()
  Cannot find admin pages: http://localhost:8983/solr/admin, please report a
  bug
  [/error]
 
  With both solr.xml configs the following url returns just fine:
 
  http://localhost:8983/solr/statements/admin/system?wt=json
 
  Regards,
 
  Dmitry
 
 
 
  On Wed, Aug 14, 2013 at 2:03 PM, Dmitry Kan solrexp...@gmail.com
 wrote:
 
   Hi Roman,
  
    This looks much better, thanks! The ordinary non-comparison mode works.
    I'll post here, if there are other findings.
  
   Thanks for quick turnarounds,
  
   Dmitry
  
  
   On Wed, Aug 14, 2013 at 1:32 AM, Roman Chyla roman.ch...@gmail.com
  wrote:
  
   Hi Dmitry, oh yes, late night fixes... :) The latest commit should
 make
  it
   work for you.
   Thanks!
  
   roman
  
  
   On Tue, Aug 13, 2013 at 3:37 AM, Dmitry Kan solrexp...@gmail.com
  wrote:
  
Hi Roman,
   
Something bad happened in fresh checkout:
   
python solrjmeter.py -a -x ./jmx/SolrQueryTest.jmx -q
./queries/demo/demo.queries -s localhost -p 8983 -a --durationInSecs
  60
   -R
cms -t /solr/statements -e statements -U 100
   
 Traceback (most recent call last):
   File "solrjmeter.py", line 1392, in <module>
     main(sys.argv)
   File "solrjmeter.py", line 1347, in main
     save_into_file('before-test.json', simplejson.dumps(before_test))
   File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 286, in dumps
     return _default_encoder.encode(obj)
   File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 226, in encode
     chunks = self.iterencode(o, _one_shot=True)
   File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 296, in iterencode
     return _iterencode(o, 0)
   File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 202, in default
     raise TypeError(repr(o) + " is not JSON serializable")
 TypeError: <__main__.ForgivingValue object at 0x7fc6d4040fd0> is not JSON serializable
   
   
Regards,
   
D.
   
   
On Tue, Aug 13, 2013 at 8:10 AM, Roman Chyla roman.ch...@gmail.com
 
wrote:
   
 Hi Dmitry,



 On Mon, Aug 12, 2013 at 9:36 AM, Dmitry Kan solrexp...@gmail.com
 
wrote:

  Hi Roman,
 
  Good point. I managed to run the command with -C and double
  quotes:
 
  python solrjmeter.py -a -C g1,cms -c hour -x
   ./jmx/SolrQueryTest.jmx
 
  As a result got several files (html, css, js, csv) in the
 running
 directory
  (any way to specify where the output should be stored in this
  case?)
 

 i know it is confusing, i plan to change it - but later, now it is
  too
busy
 here...


 
  When I look onto the comparison dashboard, I see this:
 
  http://pbrd.co/17IRI0b
 

 two 

Re: Flushing cache without restarting everything?

2013-08-23 Thread Dmitry Kan
by monitoring the original and changed systems over long enough periods,
where long enough is a parameter (to compute).
Or then going really low-level, if you know which component has been
changed (like they do in Lucene [1]; not always possible in Solr..)

[1] http://people.apache.org/~mikemccand/lucenebench/



On Thu, Aug 22, 2013 at 3:58 PM, Jean-Sebastien Vachon 
jean-sebastien.vac...@wantedanalytics.com wrote:

 How can you validate that the changes you just made had any impact on the
 performance of the cloud if you don't have the same starting conditions?

 What we do basically is running a batch of requests to warm up the index
 and then launch the benchmark itself. That way we can measure the impact of
 our change(s). Otherwise there is absolutely no way we can be sure who is
 responsible for the gain or loss of performance.

 Restarting a cloud is actually a real pain, I just want to know if there
 is a faster way to proceed.

  -Original Message-
  From: Dmitry Kan [mailto:solrexp...@gmail.com]
  Sent: August-22-13 7:26 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Flushing cache without restarting everything?
 
  But is it really good benchmarking if you flush the cache? Wouldn't you
  want to benchmark against a system that is comparable to what is under
  real (=production) load?
 
  Dmitry
 
 
  On Tue, Aug 20, 2013 at 9:39 PM, Jean-Sebastien Vachon  jean-
  sebastien.vac...@wantedanalytics.com wrote:
 
   I just want to run benchmarks and want to have the same starting
   conditions.
  
-Original Message-
From: Walter Underwood [mailto:wun...@wunderwood.org]
Sent: August-20-13 2:06 PM
To: solr-user@lucene.apache.org
Subject: Re: Flushing cache without restarting everything?
   
Why? What are you trying to achieve with this? --wunder
   
On Aug 20, 2013, at 11:04 AM, Jean-Sebastien Vachon wrote:
   
 Hi All,

 Is there a way to flush the cache of all nodes in a Solr Cloud (by
   reloading all
the cores, through the collection API, ...) without having to
restart
   all nodes?

 Thanks
   
   
   
   
   
  
 



Query term count over a result set

2013-08-23 Thread JZ
Hi all,

I would like to get the total count of a query term over a result set. Is
there a way to get this?

I know there is a TermVectorComponent that does this per result (document),
but it would be far too expensive to sum that over all documents in the
result set for a given term.

The LukeRequestHandler and the terms component only present the term counts
over the whole index.

Thanks!


RE: How to set discountOverlaps=true in Solr 4x schema.xml

2013-08-23 Thread Markus Jelsma
Yes, discountOverlaps is used in computeNorm which is used at index time. You 
should see a change after reindexing.
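
For reference, this is where the flag lives in schema.xml, assuming the
default similarity (a sketch, untested):

<similarity class="solr.DefaultSimilarityFactory">
  <bool name="discountOverlaps">true</bool>
</similarity>

Since it only affects norms, which are baked in at index time, it takes
effect for newly (re)indexed documents.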

Cheers,
Markus
 
-Original message-
 From:Tom Burton-West tburt...@umich.edu
 Sent: Thursday 22nd August 2013 23:32
 To: solr-user@lucene.apache.org
 Subject: Re: How to set discountOverlaps="true" in Solr 4x
 schema.xml
 
 I should have said that I have set it both to true and to false and
 restarted Solr each time and the rankings and info in the debug query
 showed no change.
 
 Does this have to be set at index time?
 
 Tom
 
 
 
 
 


Re: dataimporter tika fields empty

2013-08-23 Thread Andreas Owen
OK, but I'm not doing any path extraction, at least I don't think so.

htmlMapper="identity" isn't preserving the HTML.

It's reading the content of the pages, but it isn't ending up in both
text_test and text: it's only in text_test, because the copyField isn't working.

data-config.xml:

<dataConfig>
    <dataSource type="BinFileDataSource" name="data"/>
    <dataSource type="BinURLDataSource" name="dataUrl"/>
    <dataSource type="URLDataSource" name="main"/>
    <document>
        <entity name="rec" processor="XPathEntityProcessor"
                url="http://127.0.0.1/tkb/internet/docImportUrl.xml" forEach="/docs/doc"
                dataSource="main">
            <field column="title" xpath="//title" />
            <field column="id" xpath="//id" />
            <field column="file" xpath="//file" />
            <field column="path" xpath="//path" />
            <field column="url" xpath="//url" />
            <field column="Author" xpath="//author" />

            <entity name="tika" processor="TikaEntityProcessor"
                    url="${rec.path}${rec.file}" dataSource="dataUrl" onError="skip"
                    htmlMapper="identity">
                <field column="text" name="text_test" />
                <copyField source="text_test" dest="text" />
                <!-- <field column="text_test" xpath="//div[@id='content']" /> -->
            </entity>
        </entity>
    </document>
</dataConfig>


On 22. Aug 2013, at 10:06 PM, Alexandre Rafalovitch wrote:

 Ah. That's because Tika processor does not support path extraction. You
 need to nest one more level.
 
 Regards,
  Alex
 On 22 Aug 2013 13:34, Andreas Owen a...@conx.ch wrote:
 
 I can do it like this, but then the content isn't copied to text; it's just
 in text_test:
 
 <entity name="tika" processor="TikaEntityProcessor"
         url="${rec.path}${rec.file}" dataSource="dataUrl">
     <field column="text" name="text_test" />
     <copyField source="text_test" dest="text" />
 </entity>
 
 
 On 22. Aug 2013, at 6:12 PM, Andreas Owen wrote:
 
 i put it in the tika-entity as attribute, but it doesn't change
 anything. my bigger concern is why text_test isn't populated at all
 
 On 22. Aug 2013, at 5:27 PM, Alexandre Rafalovitch wrote:
 
 Can you try SOLR-4530 switch:
 https://issues.apache.org/jira/browse/SOLR-4530
 
 Specifically, setting htmlMapper=identity on the entity definition.
 This
 will tell Tika to send full HTML rather than a seriously stripped one.
 
 Regards,
 Alex.
 
 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)
 
 
 On Thu, Aug 22, 2013 at 11:02 AM, Andreas Owen a...@conx.ch wrote:
 
  I'm trying to index an HTML page and only use the div with id="content".
  Unfortunately nothing is working within the tika-entity; only the standard
  text (content) is populated.
 
   Do I have to use copyField for text_test to get the data?
   Or is there a problem with the entity hierarchy?
   Or is the xpath wrong, even though I've tried it without and just using text?
   Or should I use the update extractor?
 
 data-config.xml:
 
  <dataConfig>
      <dataSource type="BinFileDataSource" name="data"/>
      <dataSource type="BinURLDataSource" name="dataUrl"/>
      <dataSource type="URLDataSource" baseUrl="http://127.0.0.1/tkb/internet/" name="main"/>
      <document>
          <entity name="rec" processor="XPathEntityProcessor"
                  url="docImportUrl.xml" forEach="/docs/doc" dataSource="main">
              <field column="title" xpath="//title" />
              <field column="id" xpath="//id" />
              <field column="file" xpath="//file" />
              <field column="path" xpath="//path" />
              <field column="url" xpath="//url" />
              <field column="Author" xpath="//author" />

              <entity name="tika" processor="TikaEntityProcessor"
                      url="${rec.path}${rec.file}" dataSource="dataUrl">
                  <!-- <copyField source="text" dest="text_test" /> -->
                  <field column="text_test" xpath="//div[@id='content']" />
              </entity>
          </entity>
      </document>
  </dataConfig>
 
 docImporterUrl.xml:
 
  <?xml version="1.0" encoding="utf-8"?>
  <docs>
      <doc>
          <id>5</id>
          <author>tkb</author>
          <title>Startseite</title>
          <description>blabla ...</description>
          <file>http://localhost/tkb/internet/index.cfm</file>
          <url>http://localhost/tkb/internet/index.cfm</url>
          <path2>http\specialConf</path2>
      </doc>
      <doc>
          <id>6</id>
          <author>tkb</author>
          <title>Eigenheim</title>
          <description>Machen Sie sich erste Gedanken über den
  Erwerb von Wohneigentum? Oder haben Sie bereits konkrete Pläne oder gar ein
  spruchreifes Projekt? Wir beraten Sie gerne in allen Fragen rund um den
  Erwerb oder Bau von Wohneigentum, damit Ihr Vorhaben auch in finanzieller
  Hinsicht gelingt.</description>
          <file>http://127.0.0.1/tkb/internet/private/beratung/eigenheim.htm</file>
          <url>
 

Re: dataimporter tika fields empty

2013-08-23 Thread Andreas Owen
I changed the following line (xpath): <field column="text"
xpath="//div[@id='content']" name="text_test" />

On 22. Aug 2013, at 10:06 PM, Alexandre Rafalovitch wrote:

 Ah. That's because Tika processor does not support path extraction. You
 need to nest one more level.
 
 Regards,
  Alex
 On 22 Aug 2013 13:34, Andreas Owen a...@conx.ch wrote:
 
 I can do it like this, but then the content isn't copied to text; it's just
 in text_test:
 
 <entity name="tika" processor="TikaEntityProcessor"
         url="${rec.path}${rec.file}" dataSource="dataUrl">
     <field column="text" name="text_test" />
     <copyField source="text_test" dest="text" />
 </entity>
 
 
 On 22. Aug 2013, at 6:12 PM, Andreas Owen wrote:
 
 i put it in the tika-entity as attribute, but it doesn't change
 anything. my bigger concern is why text_test isn't populated at all
 
 On 22. Aug 2013, at 5:27 PM, Alexandre Rafalovitch wrote:
 
 Can you try SOLR-4530 switch:
 https://issues.apache.org/jira/browse/SOLR-4530
 
 Specifically, setting htmlMapper=identity on the entity definition.
 This
 will tell Tika to send full HTML rather than a seriously stripped one.
 
 Regards,
 Alex.
 
 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)
 
 
 On Thu, Aug 22, 2013 at 11:02 AM, Andreas Owen a...@conx.ch wrote:
 
  I'm trying to index an HTML page and only use the div with id="content".
  Unfortunately nothing is working within the tika-entity; only the standard
  text (content) is populated.
 
   Do I have to use copyField for text_test to get the data?
   Or is there a problem with the entity hierarchy?
   Or is the xpath wrong, even though I've tried it without and just using text?
   Or should I use the update extractor?
 
 data-config.xml:
 
  <dataConfig>
      <dataSource type="BinFileDataSource" name="data"/>
      <dataSource type="BinURLDataSource" name="dataUrl"/>
      <dataSource type="URLDataSource" baseUrl="http://127.0.0.1/tkb/internet/" name="main"/>
      <document>
          <entity name="rec" processor="XPathEntityProcessor"
                  url="docImportUrl.xml" forEach="/docs/doc" dataSource="main">
              <field column="title" xpath="//title" />
              <field column="id" xpath="//id" />
              <field column="file" xpath="//file" />
              <field column="path" xpath="//path" />
              <field column="url" xpath="//url" />
              <field column="Author" xpath="//author" />

              <entity name="tika" processor="TikaEntityProcessor"
                      url="${rec.path}${rec.file}" dataSource="dataUrl">
                  <!-- <copyField source="text" dest="text_test" /> -->
                  <field column="text_test" xpath="//div[@id='content']" />
              </entity>
          </entity>
      </document>
  </dataConfig>
 
 docImporterUrl.xml:
 
  <?xml version="1.0" encoding="utf-8"?>
  <docs>
      <doc>
          <id>5</id>
          <author>tkb</author>
          <title>Startseite</title>
          <description>blabla ...</description>
          <file>http://localhost/tkb/internet/index.cfm</file>
          <url>http://localhost/tkb/internet/index.cfm</url>
          <path2>http\specialConf</path2>
      </doc>
      <doc>
          <id>6</id>
          <author>tkb</author>
          <title>Eigenheim</title>
          <description>Machen Sie sich erste Gedanken über den
  Erwerb von Wohneigentum? Oder haben Sie bereits konkrete Pläne oder gar ein
  spruchreifes Projekt? Wir beraten Sie gerne in allen Fragen rund um den
  Erwerb oder Bau von Wohneigentum, damit Ihr Vorhaben auch in finanzieller
  Hinsicht gelingt.</description>
          <file>http://127.0.0.1/tkb/internet/private/beratung/eigenheim.htm</file>
          <url>http://127.0.0.1/tkb/internet/private/beratung/eigenheim.htm</url>
      </doc>
  </docs>
 
 



Re: Query term count over a result set

2013-08-23 Thread Jack Krupansky
You can get the term frequency (per document) for a term using the 
termfreq() function query in the fl parameter:


fl=*,termfreq(field,'term')
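
For example (illustrative URL; the field name text is an assumption):

http://localhost:8983/solr/select?q=solr&fl=*,termfreq(text,'solr')

Each returned document then carries a termfreq(text,'solr') pseudo-field;
summing those values over the result set has to be done client-side.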

-- Jack Krupansky

-Original Message- 
From: JZ

Sent: Friday, August 23, 2013 7:43 AM
To: solr-user@lucene.apache.org
Subject: Query term count over a result set

Hi all,

I would like to get the total count of a query term over a result set. Is
there a way to get this?

I know there is a TermVectorComponent that does this per result (document),
but it would be far too expensive to sum that over all documents in the
result set for a given term.

The LukeRequestHandler and the terms component only present the term counts
over the whole index.

Thanks! 



Re: Query term count over a result set

2013-08-23 Thread Ahmet Arslan
Hi JZ,

You can use the faceting component.

http://localhost:8080/solr/core/select?q=ahmet&wt=xml&facet=on&facet.field=title&facet.prefix=queryTerm






 From: JZ zhangju...@gmail.com
To: solr-user@lucene.apache.org 
Sent: Friday, August 23, 2013 2:43 PM
Subject: Query term count over a result set
 

Hi all,

I would like to get the total count of a query term over a result set. Is
there a way to get this?

I know there is a TermVectorComponent that does this per result (document),
but it would be far too expensive to sum that over all documents in the
result set for a given term.

The LukeRequestHandler and the terms component only present the term counts
over the whole index.

Thanks!

Re: Reloading synonyms and stop words

2013-08-23 Thread Bruno René Santos
Hi again,

Thanks for the help :)

I have this handler:

public class SynonymsHandler extends RequestHandlerBase implements
        SolrCoreAware {

    public SynonymsHandler() {}

    private static Logger log = LoggerFactory.getLogger(SynonymsHandler.class);

    @Override
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
            throws Exception {
        System.out.println(req.getContext());
        // Dispatch on the request path this handler was invoked under.
        if (req.getContext().get("path").equals("/synonyms/update")) {}
        if (req.getContext().get("path").equals("/synonyms/get")) {}
        // Reload the core after the synonyms file has been rewritten.
        req.getCore().reload(req.getCore());
    }

    @Override
    public String getDescription() {
        return null;
    }

    @Override
    public String getSource() {
        return null;
    }

    @Override
    public void inform(SolrCore core) {}

}

and when i call the reload I get this error:

63748 T33 C6 oasc.SolrException.log ERROR java.lang.NullPointerException
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
 at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
 at
org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1693)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)

The NPE happens here (SearchHandler.java:181):

ShardHandler shardHandler1 = shardHandlerFactory.getShardHandler();

The factory is null... how can I get it to initialize? I checked in the
CoreAdminHandler that I have to do something in inform like you said,
but I am not sure what... the inform is recursive, right? Could I try to
execute the CoreAdminHandler from within my handler with the reload action?
I am not sure what the best practice is.

Regards
Bruno


On Fri, Aug 23, 2013 at 11:20 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 I don't think that should be a problem. Your custom RequestHandler
 must call reload. Note that a new instance of your request handler
 will be created and inform will be called on it once reload happens
 i.e. you won't be able to keep any state in the request handler across
 core reloads.

 You can also do this at a level above RequestHandler i.e. via a custom
 CoreAdminHandler. See CoreAdminHandler.handleCustomAction()

 On Fri, Aug 23, 2013 at 2:57 PM, Bruno René Santos brunor...@gmail.com
 wrote:
  Great! What about inside a RequestHandler source code in Java? I want to
  create a requestHandler that receives new synonyms, insert them on the
  synonyms file and reload the core.
 
  Regards
  Bruno
 
 
  On Fri, Aug 23, 2013 at 9:28 AM, Shalin Shekhar Mangar 
  shalinman...@gmail.com wrote:
 
  Yes, you can use the Core RELOAD command:
 
 
 
 https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D
 
  On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com
 
  wrote:
   Hello,
  
   Is it possible to reload the synonyms and stopwords files without
  rebooting
   solr?
  
   Regards
   Bruno Santos
  
   --
   Bruno René Santos
   Lisboa - Portugal
 
 
 
  --
  Regards,
  Shalin Shekhar Mangar.
 
 
 
 
  --
  Bruno René Santos
  Lisboa - Portugal



 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Bruno René Santos
Lisboa - Portugal


Re: Problem with importing tab-delimited csv file

2013-08-23 Thread Jack Krupansky
Your data file appears to use spaces rather than tabs.

-- Jack Krupansky

From: Rob Koeling Ai 
Sent: Friday, August 23, 2013 6:38 AM
To: solr-user@lucene.apache.org 
Subject: Problem with importing tab-delimited csv file


I'm having trouble importing a tab-delimited file with the csv update handler.

My data file looks like this:

id question answer url
q99 Who? You! none

When I send this data to Solr using Curl:

curl 
'http://localhost:8181/solr/development/update/csv?commit=true&separator=%09' 
--data @sample.tmp

All seems to be well:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">221</int></lst>
</response>


But when I query the development core, there is no data. I must be overlooking 
something trivial. I would appreciate if anyone could spot what!

  - Rob



 


Re: How to avoid underscore sign indexing problem?

2013-08-23 Thread Jack Krupansky
Exactly - Solr does not define the punctuation, UAX#29 defines it, and I 
have deciphered the UAX#29 rules and included them in my book. Some 
punctuation is always punctuation and always removed, and some is 
conditional on context - I tried to lay out all the implied rules.


-- Jack Krupansky

-Original Message- 
From: Steve Rowe

Sent: Friday, August 23, 2013 12:30 AM
To: solr-user@lucene.apache.org
Subject: Re: How to avoid underscore sign indexing problem?

Dan,

StandardTokenizer implements the word boundary rules from the Unicode Text 
Segmentation standard annex UAX#29:


  http://www.unicode.org/reports/tr29/#Word_Boundaries

Every character sequence within UAX#29 boundaries that contains a numeric or 
an alphabetic character is emitted as a term, and nothing else is emitted.


Punctuation can be included within a term, e.g. 1,248.99 or 192.168.1.1.

To split on underscores, you can convert underscores to e.g. spaces by 
adding PatternReplaeCharFilterFactory to your analyzer:


   <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="_" replacement=" "/>


This replacement will be performed prior to StandardTokenizer, which will 
then see token-splitting spaces instead of underscores.
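
In context, the char filter sits at the top of the analyzer, e.g. (a sketch
of a field type, not a complete schema):

<fieldType name="text_underscore" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Runs before tokenization: turns pacific_rim into "pacific rim" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="_" replacement=" "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>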


Steve

On Aug 22, 2013, at 10:23 PM, Dan Davis dansm...@gmail.com wrote:


Ah, but what is the definition of punctuation in Solr?


On Wed, Aug 21, 2013 at 11:15 PM, Jack Krupansky 
j...@basetechnology.comwrote:



I thought that the StandardTokenizer always split on punctuation, 

Proving that you haven't read my book! The section on the standard
tokenizer details the rules that the tokenizer uses (in addition to
extensive examples.) That's what I mean by deep dive.

-- Jack Krupansky

-Original Message- From: Shawn Heisey
Sent: Wednesday, August 21, 2013 10:41 PM
To: solr-user@lucene.apache.org
Subject: Re: How to avoid underscore sign indexing problem?


On 8/21/2013 7:54 PM, Floyd Wu wrote:


When using StandardAnalyzer to tokenize the string Pacific_Rim you get:

ST  text: pacific_rim  raw_bytes: [70 61 63 69 66 69 63 5f 72 69 6d]
    start: 0  end: 11  type: ALPHANUM  position: 1

How can I make this string be tokenized into the two tokens Pacific and Rim?
Set _ as a stopword?
Please kindly help with this.
Many thanks.



Interesting.  I thought that the StandardTokenizer always split on
punctuation, but apparently that's not the case for the underscore
character.

You can always use the WordDelimeterFilter after the StandardTokenizer.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

Thanks,
Shawn



RE: how to integrate solr with HDFS HA

2013-08-23 Thread Greg Walters
Finally something I can help with! I went through the same problems you're 
having a short while ago. Check out 
https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS for most 
of the information you need and be sure to check the comments on the page as 
well.

Here's an example from my working setup:

**
  <directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
    <bool name="solr.hdfs.blockcache.enabled">true</bool>
    <int name="solr.hdfs.blockcache.slab.count">1</int>
    <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
    <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
    <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
    <bool name="solr.hdfs.blockcache.write.enabled">true</bool>
    <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
    <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
    <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
    <str name="solr.hdfs.home">hdfs://nameservice1:8020/solr</str>
    <str name="solr.hdfs.confdir">/etc/hadoop/conf.cloudera.hdfs1</str>
  </directoryFactory>
**
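
As an aside: the UnknownHostException means the HDFS client never saw the 
HA nameservice definition, so solr.hdfs.confdir must point at a Hadoop conf 
directory whose hdfs-site.xml declares it. A sketch of the relevant entries, 
assuming the "lklcluster" nameservice from the exception and made-up 
NameNode hosts:

<!-- hdfs-site.xml fragment; nn1host/nn2host are illustrative -->
<property><name>dfs.nameservices</name><value>lklcluster</value></property>
<property><name>dfs.ha.namenodes.lklcluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.lklcluster.nn1</name><value>nn1host:8020</value></property>
<property><name>dfs.namenode.rpc-address.lklcluster.nn2</name><value>nn2host:8020</value></property>
<property><name>dfs.client.failover.proxy.provider.lklcluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

With that in place, solr.hdfs.home can use the nameservice instead of a 
host:port, e.g. hdfs://lklcluster/solr.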

Thanks,
Greg

-Original Message-
From: YouPeng Yang [mailto:yypvsxf19870...@gmail.com] 
Sent: Friday, August 23, 2013 1:16 AM
To: solr-user@lucene.apache.org
Subject: how to integrate solr with HDFS HA



RE: Caused by: java.net.SocketException: Connection reset by peer: socket write error solr querying

2013-08-23 Thread Greg Walters
If you're using the bundled jetty that comes with the download, check the 
etc/jetty.xml property for maxIdleTime and set it appropriately. I get that 
error when operations take longer than the property is set to and time out. Do 
note that the property is specified in milliseconds!
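
For reference, a sketch of where that lives in the stock Solr 4.x 
etc/jetty.xml (host/port settings omitted; 50000 ms is the value the example 
config commonly ships with, so raise it if legitimate operations run longer):

<Call name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.bio.SocketConnector">
      <!-- idle timeout in milliseconds -->
      <Set name="maxIdleTime">50000</Set>
    </New>
  </Arg>
</Call>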

Thanks,
Greg



-Original Message-
From: aniljayanti [mailto:aniljaya...@yahoo.co.in] 
Sent: Thursday, August 22, 2013 11:44 PM
To: solr-user@lucene.apache.org
Subject: Caused by: java.net.SocketException: Connection reset by peer: socket 
write error solr querying

Hi,

I am working on Solr 4.4 with Jetty, and generated an index of 3,350,128
records. Now I want to test query performance, so I applied a load test of 5
minutes with 600 virtual users running different Solr queries. After the test
completed I got the errors below.

ERROR - 2013-08-23 09:49:43.867; org.apache.solr.common.SolrException;
null:org.eclipse.jetty.io.EofException
at 
org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:914)
at
org.eclipse.jetty.http.AbstractGenerator.blockForOutput(AbstractGenerator.java:507)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:170)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:263)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:106)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:190)
at org.apache.solr.util.FastWriter.flush(FastWriter.java:141)
at org.apache.solr.util.FastWriter.write(FastWriter.java:126)
at java.io.Writer.write(Writer.java:140)
at org.apache.solr.response.XMLWriter.startTag(XMLWriter.java:144)
at org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:347)
at org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:295)
at org.apache.solr.schema.StrField.write(StrField.java:67)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:130)
at org.apache.solr.response.XMLWriter.writeArray(XMLWriter.java:273)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:190)
at 
org.apache.solr.response.XMLWriter.writeSolrDocument(XMLWriter.java:199)
at
org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:275)
at
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:172)
at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:111)
at
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:39)
at
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:647)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:375)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at

Re: Reloading synonyms and stop words

2013-08-23 Thread Shalin Shekhar Mangar
Actually I was suggesting that you execute the CoreAdminHandler from
within your handler or you can try calling CoreContainer.reload
directly.

On Fri, Aug 23, 2013 at 6:13 PM, Bruno René Santos brunor...@gmail.com wrote:
 Hi again,

 Thanx for the help :)

 I have this handler:

 public class SynonymsHandler extends RequestHandlerBase implements
 SolrCoreAware {

 public SynonymsHandler() {}

 private static Logger log = LoggerFactory.getLogger(SynonymsHandler.class);

 @Override
 public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
 throws Exception {
  System.out.println(req.getContext());
 if (req.getContext().get("path").equals("/synonyms/update")) {}
  if (req.getContext().get("path").equals("/synonyms/get")) {}
 req.getCore().reload(req.getCore());
  }

 @Override
 public String getDescription() {
  return null;
 }

 @Override
 public String getSource() {
  return null;
 }

 @Override
 public void inform(SolrCore core) {}

 }

 and when i call the reload I get this error:

 63748 T33 C6 oasc.SolrException.log ERROR java.lang.NullPointerException
 at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
  at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
  at
 org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
 at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1693)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:724)

 here

 ShardHandler shardHandler1 = shardHandlerFactory.getShardHandler();

 the factory is null... how can I get it to initialize? I checked in the
 CoreAdminHandler that I have to do something in inform() like you said,
 but I am not sure what... inform() is recursive, right? Could I try to
 execute the CoreAdminHandler from within my Handler with the reload action?
 I am not sure what the best practice is.

 Regards
 Bruno


 On Fri, Aug 23, 2013 at 11:20 AM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

 I don't think that should be a problem. Your custom RequestHandler
 must call reload. Note that a new instance of your request handler
 will be created and inform will be called on it once reload happens
 i.e. you won't be able to keep any state in the request handler across
 core reloads.

 You can also do this at a level above RequestHandler i.e. via a custom
 CoreAdminHandler. See CoreAdminHandler.handleCustomAction()

 On Fri, Aug 23, 2013 at 2:57 PM, Bruno René Santos brunor...@gmail.com
 wrote:
  Great! What about inside a RequestHandler source code in Java? I want to
  create a requestHandler that receives new synonyms, insert them on the
  synonyms file and reload the core.
 
  Regards
  Bruno
 
 




 --
 Bruno René Santos
 Lisboa - Portugal



-- 
Regards,
Shalin Shekhar Mangar.


Re: Reloading synonyms and stop words

2013-08-23 Thread Bruno René Santos
req.getCore().getCoreDescriptor().getCoreContainer().reload(req.getCore().getName());

works like a charm :) thanx a lot

Bruno
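
For anyone landing here later, a minimal sketch of a handler built around 
that call, assuming the Solr 4.x API (class name and response key are 
illustrative):

import org.apache.solr.core.SolrCore;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;

public class ReloadingHandler extends RequestHandlerBase {

  @Override
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    // ... rewrite the synonyms/stopwords files here, then reload the core
    // so the new analyzers pick them up ...
    SolrCore core = req.getCore();
    core.getCoreDescriptor().getCoreContainer().reload(core.getName());
    rsp.add("status", "reloaded " + core.getName());
  }

  @Override
  public String getDescription() { return "Updates resources and reloads the core"; }

  @Override
  public String getSource() { return null; }
}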


On Fri, Aug 23, 2013 at 2:48 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 Actually I was suggesting that you execute the CoreAdminHandler from
 within your handler or you can try calling CoreContainer.reload
 directly.





-- 
Bruno René Santos
Lisboa - Portugal


Re: Query term count over a result set

2013-08-23 Thread JZ
Hello,

Ahmet, using the faceting component gives me the document count for a
term, while I am interested in the term counts within the documents for a
query term.

Jack, the termfreq function query does indeed return the term frequency per
document, but not over a result set.

How can I do this over a result set?

I do not think there is something ready made, but perhaps a pointer to a
plugin or some code (or explanation why this does not exist yet) would be
great!

Thanks


On Fri, Aug 23, 2013 at 2:42 PM, Ahmet Arslan iori...@yahoo.com wrote:

 Hi JZ,

 You can use faceting component.


 http://localhost:8080/solr/core/select?q=ahmet&wt=xml&facet=on&facet.field=title&facet.prefix=queryTerm





  From: JZ zhangju...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Friday, August 23, 2013 2:43 PM
 Subject: Query term count over a result set


 Hi all,

I would like to get the total count of a query term over a result set. Is
there a way to get this?

I know there is a TermVectorComponent that does this per result (document),
but it would be far too expensive to take the sum over all documents for a
given term.

 The LukeRequestHandler and the terms component only present the term counts
 over the whole index.

 Thanks!



Solrconfig.xml

2013-08-23 Thread Bruno René Santos
Is there any way inside a handleRequestBody on a RequestHandler to know the
directory where the core configuration is? (schema.xml, solrconfig.xml,
synonyms, etc)

Regards
Bruno

-- 
Bruno René Santos
Lisboa - Portugal


Index a database table?

2013-08-23 Thread Kamaljeet Kaur
Hello there,

I just got something to index a MySQL database table:
http://wiki.apache.org/solr/DIHQuickStart

Pasted the following into the config section of the solrconfig.xml file
(solr-4.4.0/example/solr/collection1/conf/solrconfig.xml):

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Altering the next code and making a data-config.xml file, I have written the
following, where I am not sure if "driver", "url" and the entity "name" are
correct or not. How do I know if they are wrong? The model name (i.e. the
table name) tcc_userprofile and its attributes are written in the query, and
I know they are right. "New" is the MySQL database name.

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/New"
              user="root"
              password="password"/>
  <document>
    <entity name="id"
            query="select id,first_name from tcc_userprofile">
    </entity>
  </document>
</dataConfig>
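
Once the handler and config are in place, a full import can be triggered 
against the handler (a sketch; host, port and core name assume the stock 
example setup):

curl 'http://localhost:8983/solr/collection1/dataimport?command=full-import'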




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Index-a-database-table-tp4086334.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solrconfig.xml

2013-08-23 Thread Andrea Gazzarini
Yes, if your RequestHandler implements SolrCoreAware you will get a 
SolrCore reference in the inform(...) method. From the SolrCore you have 
everything you need (specifically, the SolrResourceLoader is what you need).


Note that if your request handler is a SearchHandler you don't need to 
implement that interface, because it already does.
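
A tiny sketch of that pattern, assuming the Solr 4.x API (fragment only; 
the usual RequestHandlerBase methods are omitted, hence the abstract class):

public abstract class MyAwareHandler extends RequestHandlerBase implements SolrCoreAware {
  private String configDir;

  @Override
  public void inform(SolrCore core) {
    // the resource loader knows where this core's conf/ directory lives
    configDir = core.getResourceLoader().getConfigDir();
  }
}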


Best,
Andrea

On 08/23/2013 04:23 PM, Bruno René Santos wrote:

Is there any way inside a handleRequestBody on a RequestHandler to know the
directory where the core configuration is? (schema.xml, solrconfig.xml,
synonyms, etc)

Regards
Bruno





Re: Index a database table?

2013-08-23 Thread Andrea Gazzarini

Seems ok, assuming that:

- you have the MySQL driver jar in your $SOLR_HOME/lib
- "New" is the database name
- the user root / password combination is valid
- the table exists
- Solr has a schema with the id and first_name fields declared

About "How do I know if they are wrong?":

Why don't you try?






Re: Solrconfig.xml

2013-08-23 Thread Bruno René Santos
That is what I needed

req.getCore().getResourceLoader().getConfigDir()

Thanx
Bruno







-- 
Bruno René Santos
Lisboa - Portugal


Problem with importing tab-delimited csv file

2013-08-23 Thread Rob Koeling Ai

Thanks for the reply, Jack.

It only looks like spaces, because I did a cut-and-paste. The file in question 
does contain tabs instead of spaces, i.e.:

id	question	answer	url
q99	Who?	You!	none


Another question I meant to ask is whether this sort of activity is logged 
anywhere. I mean, after adding or deleting data, is there a record of that 
action somewhere?
The 'logging' tab on the Dashboard page only reports errors as far as I can see.

Thanks,

   - Rob









Re: Storing query results

2013-08-23 Thread jfeist
I completely agree. I would prefer to just rerun the search each time.
However, we are going to be replacing our RDB-based search with something
like Solr, and the application currently behaves this way. Our users
understand that the search is essentially a snapshot (and I would guess many
prefer this over changing results) and we don't want to change existing
behavior and confuse anyone. Also, my boss told me it unequivocally has to
be this way :p

Thanks for your input though, looks like I'm going to have to do something
like you've suggested within our application.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Storing-query-results-tp4086182p4086349.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR Prevent solr of modifying fields when update doc

2013-08-23 Thread Greg Preston
Perhaps an atomic update that only changes the fields you want to change?

-Greg
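
A sketch of what that could look like, assuming Solr 4.x atomic updates 
(which need the updateLog enabled and the other fields stored; doc id and 
field name are illustrative):

curl 'http://localhost:8983/solr/update?commit=true' \
  -H 'Content-type:application/json' \
  --data-binary '[{"id":"feed-item-1","title":{"set":"updated title"}}]'

Fields not mentioned in the update, such as the uuid field, keep their 
stored values.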


On Fri, Aug 23, 2013 at 4:16 AM, Luís Portela Afonso
meligalet...@gmail.com wrote:
 Hi, thanks for the answer, but the uniqueId is generated by me. When solr 
 indexes and there is an update to a doc, it deletes the doc and creates a new 
 one, so it generates a new UUID.
 That is not suitable for me, because i want solr to update just some fields; 
 the UUID is the key that i use to map the doc to a user in my database.

 Right now i'm using information that comes from the source and never changes 
 as my uniqueId, like for example the guid that exists in some rss feeds, or, 
 if it doesn't exist, the link.

 I don't think there is any simple solution for me, because from what i have 
 read, when an update to a doc exists, SOLR deletes the old one and creates a 
 new one, right?

 On Aug 23, 2013, at 12:07 PM, Erick Erickson erickerick...@gmail.com wrote:

 Well, not much in the way of help because you can't do what you
 want AFAIK. I don't think UUID is suitable for your use-case. Why not
 use your uniqueId?

 Or generate something yourself...

 Best
 Erick


 On Thu, Aug 22, 2013 at 5:56 PM, Luís Portela Afonso meligalet...@gmail.com
 wrote:

 Hi,

 How can i prevent solr from updating some fields when updating a doc?
 The problem is, i have a uuid with the field name uuid, but it is not a
 unique key. When a rss source updates a feed, solr will update the doc with
 the same link but it generates a new uuid. This is not desired because
 this id is used by me to relate feeds with a user.

 Can someone help me?

 Many Thanks





Re: Problem with importing tab-delimited csv file

2013-08-23 Thread Jack Krupansky

You need the CSV content type header and --data-binary.

I tried this with Solr 4.4:

curl 'http://localhost:8983/solr/update?commit=true&separator=%09' -H 
'Content-type:application/csv' --data-binary @sample.tmp


Otherwise, Solr just ignores the request.

-- Jack Krupansky







solr built with maven

2013-08-23 Thread Bruno René Santos
Hello all,

I am building Solr's source code through Maven in order to develop on top
of it in NetBeans (as no Ant task was made for NetBeans... not cool!).

Three questions about that:

1. How can I execute the solr server?
2. How can i debug the solr server?
3. If I create new packages (RequestHandlers, TOkenizers, etc) where can I
put them so that the compilation process will view the new files?

Regards
Bruno Santos

-- 
Bruno René Santos
Lisboa - Portugal


Re: solr built with maven

2013-08-23 Thread Brendan Grainger
You want to change the solr source code itself, or do you want to create your
own Tokenizers and things? If the latter, why not just set up solr as a
dependency in your pom.xml like so:

<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-test-framework</artifactId>
  <scope>test</scope>
  <version>${solr.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-test-framework</artifactId>
  <scope>test</scope>
  <version>${solr.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <version>${solr.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-facet</artifactId>
  <version>${solr.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr</artifactId>
  <version>${solr.version}</version>
  <type>war</type>
</dependency>

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>${solr.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>${solr.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-langid</artifactId>
  <version>${solr.version}</version>
</dependency>

<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.16</version>
</dependency>

<dependency>
  <groupId>commons-cli</groupId>
  <artifactId>commons-cli</artifactId>
  <version>1.2</version>
</dependency>

<dependency>
  <groupId>javax.servlet</groupId>
  <artifactId>servlet-api</artifactId>
  <version>2.5</version>
</dependency>


On Fri, Aug 23, 2013 at 12:24 PM, Bruno René Santos brunor...@gmail.com wrote:

 Hello all,

 I am building Solr's source code through maven in order to develop on top
 of it on Netbeans (As no ant task was made to Netbeans... not cool!).

 Three doubts about that:

 1. How can I execute the solr server?
 2. How can i debug the solr server?
 3. If I create new packages (RequestHandlers, TOkenizers, etc) where can I
 put them so that the compilation process will view the new files?

 Regards
 Bruno Santos

 --
 Bruno René Santos
 Lisboa - Portugal




-- 
Brendan Grainger
www.kuripai.com


Re: solr built with maven

2013-08-23 Thread Bruno René Santos
I don't want to change solr, just extend it, but it would be nice to have the
source code in the project so that I can debug it in NetBeans. Do I need to
include Jetty too? By the way (this is a little off-topic, sorry) do you
know any site that explains how Maven works in a straightforward way? All
this magic is a little confusing sometimes...

Regards
Bruno




On Fri, Aug 23, 2013 at 5:46 PM, Brendan Grainger 
brendan.grain...@gmail.com wrote:

  You want to change the solr source code itself, or do you want to create your
  own Tokenizers and things? If the latter, why not just set up solr as a
  dependency in your pom.xml like so: [pom dependencies snipped]

 



 --
 Brendan Grainger
 www.kuripai.com




-- 
Bruno René Santos
Lisboa - Portugal


Re: solr built with maven

2013-08-23 Thread Michael Della Bitta
Hi Bruno,

IntelliJ IDEA has a one-click way of downloading the source jars of
dependencies into your project. I'd look for something similar in Netbeans
rather than trying to hack together a Maven build of Solr yourself.

Michael Della Bitta

Applications Developer

o: +1 646 532 3062  | c: +1 917 477 7906

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions https://twitter.com/Appinions
g+: https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
w: http://www.appinions.com/


On Fri, Aug 23, 2013 at 12:51 PM, Bruno René Santos brunor...@gmail.com wrote:

 I don't want to change solr, just extend it, but it would be nice to have the
 source code in the project so that I can debug it in NetBeans. Do I need to
 include Jetty too? By the way (this is a little off-topic, sorry) do you
 know any site that explains how Maven works in a straightforward way? All
 this magic is a little confusing sometimes...

 Regards
 Bruno




 --
 Bruno René Santos
 Lisboa - Portugal



Re: Problem with importing tab-delimited csv file

2013-08-23 Thread Aloke Ghoshal
Hi Rob,

I think the wrong Content-type header is getting passed. Try one of these
instead:

curl 'http://localhost:8983/solr/update/csv?commit=true&separator=%09&stream.file=/tmp/sample.tmp'

OR

curl 'http://localhost:8983/solr/update/csv?commit=true&separator=%09' -H
'Content-type:application/csv; charset=utf-8' --data-binary @sample.tmp

Regards,
Aloke









Schema.xml definition problem

2013-08-23 Thread Everton Garcia
Hello,
I want to index the XML below with multivalued fields.
What is the best way to set up the schema.xml, given the nested data?
Thank you.

<documento>
  <id/>                      // String
  <descricao/>               // String
  <data/>                    // Date
  <conteudo/>                // String
  <assentamentos>            // Multivalued
    <assentamento>           // First register
      <id/>                  // String
      <nome/>                // String
      <matricula/>           // String
      <classificacoes>       // Multivalued
        <classificacao>      // First register
          <id/>              // String
          <descricao/>       // String
          <agrupadores>      // Multivalued
            <agrupador>      // First register
              <valor/>       // String
            </agrupador>
          </agrupadores>
        </classificacao>
      </classificacoes>
    </assentamento>
  </assentamentos>
</documento>












-- 
*Everton Rodrigues Garcia*


Re: SOLR search by external fields

2013-08-23 Thread SolrLover
Did you look here?

https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes
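
For quick reference, a minimal sketch of the setup that page describes, 
assuming Solr 4.x (field, type and file names here are illustrative):

<!-- schema.xml -->
<fieldType name="extRank" class="solr.ExternalFileField" keyField="id" defVal="0" valType="pfloat"/>
<field name="rank" type="extRank" indexed="false" stored="false"/>

The values then live outside the index, in a file named external_rank in the 
core's data directory, as key=value lines:

doc1=3.2
doc2=0.5

The field is then usable in function queries (e.g. as a boost), not in 
regular search.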




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-search-by-external-fields-tp4086197p4086408.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boost by numFounds

2013-08-23 Thread Flavio Pompermaier
Any help..? Is it possible to add this pagerank-like behaviour?


Re: Grouping

2013-08-23 Thread tvellore
I'm getting the same error...Is there any workaround to this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Grouping-tp2820116p4086425.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR Prevent solr of modifying fields when update doc

2013-08-23 Thread Lance Norskog
Solr does not by default generate unique IDs. It uses what you give as 
your unique field, usually called 'id'.


What software do you use to index data from your RSS feeds? Maybe that 
is creating a new 'id' field?


There is no partial update, Solr (Lucene) always rewrites the complete 
document.


On 08/23/2013 09:03 AM, Greg Preston wrote:

Perhaps an atomic update that only changes the fields you want to change?

-Greg






Re: Index a database table?

2013-08-23 Thread Kamaljeet Kaur
On Fri, Aug 23, 2013 at 11:17 PM, Andrea Gazzarini-3 [via Lucene]
ml-node+s472066n4086384...@n3.nabble.com wrote:
 Why don't you try?


Actually I wanted every single step to be clear, that's why I asked.
Now it says:

Ensure that your solr schema (schema.xml) has the fields 'id',
'name', 'desc'. Change the appropriate details in the data-config.xml

My schema.xml does not have these fields, which means I have to declare
them. Can you tell me where in that file to declare them? Isn't there the
same option as in Solr 3.5.0, where a command built the schema and we placed
its output in the schema.xml file?
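
For what it's worth, a sketch of such declarations inside the <fields> 
section of schema.xml (the stock example schema already declares id; types 
here assume its built-in "string" and "text_general" field types):

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="first_name" type="text_general" indexed="true" stored="true"/>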

Also it's written:

Drop your JDBC driver jar file into the solr-home/lib directory.

It's a Java application for interacting with the database. Where is it? It
must be a .jar file; the rest I don't know yet. My solr/example/lib
directory has an ext directory, Jetty jars and a servlet-api-3.0.jar. Is
that fine? Then which one is the JDBC driver?



-- 
Kamaljeet Kaur

kamalkaur188.wordpress.com
facebook.com/kaur.188




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Index-a-database-table-tp4086334p4086437.html
Sent from the Solr - User mailing list archive at Nabble.com.