Re: Oracle Timestamp in SOLR

2013-05-17 Thread Peter Schütt
Hello, : SELECT ... CAST(LAST_ACTION_TIMESTAMP AS DATE) AS LAT : : This removes the time part of the timestamp in SOLR, although it is : shown in PL/SQL-Developer (Tool for Oracle). Hmmm... that makes no sense to me based on 10 seconds of googling...

Re: Deleting an entry from a collection when the key has : in it

2013-05-17 Thread Jason Hellman
The first rule of Solr without Unique Key is that we don't talk about Solr without a Unique Key. The second rule... On May 16, 2013, at 8:47 PM, Jack Krupansky j...@basetechnology.com wrote: Technically, core Solr does not require a unique key. A lot of features in Solr do require unique

Re: error while switching from log4j back to slf4j with solr 4.3

2013-05-17 Thread Bernd Fehling
On 16.05.2013 17:19, Shawn Heisey wrote: On 5/16/2013 3:24 AM, Bernd Fehling wrote: OK, solved. I have now run-jetty-run with log4j running. Just copied log4j libs from example/lib/ext to webapp/WEB-INF/classes and set -Dlog4j.configuration in run-jetty-run VM classpath. The location

Re: Solr 4 memory usage increase

2013-05-17 Thread Wei Zhao
No, exactly the same JVM of Java6 -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-memory-usage-increase-tp4064066p4064108.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr 4 memory usage increase

2013-05-17 Thread Walter Underwood
It is past time to get off of Java 6. That is dead. End of life. No more updates, not even for security bugs. What version of Java 6? Some earlier versions had bad bugs that Solr would run into. We hit them in prod until we upgraded. wunder On May 16, 2013, at 11:28 PM, Wei Zhao wrote: No,

Java heap space exception in 4.2.1

2013-05-17 Thread J Mohamed Zahoor
Hi, I moved to 4.2.1 from 4.1 recently. Everything was working fine until I added a few more stats queries. Now I am getting this error so frequently that Solr does not run even for 2 minutes continuously. All 5GB is getting used instantaneously in a few queries... SEVERE:

Re: Explicit update or delete of a dataset

2013-05-17 Thread Peter Schütt
Hello, To delete: curl http://localhost:8983/solr/update?commit=true \ -H 'Content-type:application/json' \ -d '{"delete": {"id":"doc-0001"}}' I try it in this way: http://localhost:9180/solr/mycore/update?commit=true&stream.body=delete
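The JSON delete-by-id request quoted above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not something posted in the thread: the helper name `build_delete_request` is hypothetical, the base URL is the one from the curl example, and the request is only assembled, never sent.

```python
import json
from urllib.parse import urlencode


def build_delete_request(base_url, doc_id, commit=True):
    """Assemble (url, headers, body) for a Solr JSON delete-by-id.

    Hypothetical helper for illustration; it does not contact Solr.
    """
    # commit=true asks Solr to commit right after applying the delete.
    query = urlencode({"commit": "true" if commit else "false"})
    url = f"{base_url}/update?{query}"
    headers = {"Content-type": "application/json"}
    # json.dumps emits the quoted keys and values that valid JSON requires.
    body = json.dumps({"delete": {"id": doc_id}})
    return url, headers, body


if __name__ == "__main__":
    url, headers, body = build_delete_request("http://localhost:8983/solr", "doc-0001")
    print(url)
    print(body)
```

Building the body with `json.dumps` rather than by hand avoids the unquoted-key form, which is not valid JSON.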

Re: Solr 4 memory usage increase

2013-05-17 Thread J Mohamed Zahoor
I get the same issue in 1.7.0_09-b05 also. ./zahoor On 17-May-2013, at 12:07 PM, Walter Underwood wun...@wunderwood.org wrote: It is past time to get off of Java 6. That is dead. End of life. No more updates, not even for security bugs. What version of Java 6? Some earlier versions had

Facet pivot 50.000.000 different values

2013-05-17 Thread Carlos Bonilla
Hi, To calculate some stats we are using a field B with 50.000.000 different values as facet pivot in a schema that contains 200.000.000 documents. We only need to calculate how many different B values have more than 1 document, but it takes ages. Is there any other better way/configuration to

Re: Using the Collections API

2013-05-17 Thread A.Eibner
Hi, sorry for the delay. I have two live nodes (ZooKeeper also knows these two: [app02:9985_solrl,app03:9985_solr]). But when I want to create a collection via: http://app02:9985/solr/admin/collections?action=CREATE&name=storage&numShards=1&replicationFactor=2&collection.configName=storage-conf

Re: Issue with getting highlight with hl.maxAnalyzedChars = -1

2013-05-17 Thread Dmitry Kan
Can you solve by retaining hl.maxAnalyzedChars=maxLength+buffer, where maxLength is the max length of your text field plus some reasonable buffer on top? On Tue, May 14, 2013 at 1:03 PM, meghana meghana.rav...@amultek.com wrote: Hi, Query pasted in my post, is returning 1 record with 0

Re: Using the Collections API

2013-05-17 Thread Jared Rodriguez
Hi Alexander, So it sounds like you want the collection created with a master and a replica and you want one to be on each node? If so, I believe that you can get that effect by specifying maxShardsPerNode=1 as part of your url line. This will tell solr to create the master and replica that you
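The CREATE call discussed in this thread, with `maxShardsPerNode=1` added as suggested above, could be assembled like this. It is a hedged sketch: the function name `build_create_collection_url` is hypothetical, the host, collection name, and config name are the values from the original post, and whether the replicas actually land on separate nodes depends on the Solr version in use.

```python
from urllib.parse import urlencode


def build_create_collection_url(host, name, config_name,
                                num_shards=1, replication_factor=2,
                                max_shards_per_node=1):
    """Return a Collections API CREATE URL (hypothetical helper, no request is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        # Limits how many shard replicas of this collection one node may host.
        "maxShardsPerNode": max_shards_per_node,
        "collection.configName": config_name,
    }
    # urlencode joins the parameters with '&' and escapes values where needed.
    return f"http://{host}/solr/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    print(build_create_collection_url("app02:9985", "storage", "storage-conf"))
```

Using `urlencode` keeps the `&` separators explicit, which matters when a URL is pasted into a shell or an e-mail client that may mangle them.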

Re: Facet pivot 50.000.000 different values

2013-05-17 Thread Carlos Bonilla
Sorry, 16 GB RAM (not 8). 2013/5/17 Carlos Bonilla carlosbonill...@gmail.com Hi, To calculate some stats we are using a field B with 50.000.000 different values as facet pivot in a schema that contains 200.000.000 documents. We only need to calculate how many different B values have more

Searching for terms having embedded white spaces like word1 word2

2013-05-17 Thread kobe.free.wo...@gmail.com
Hi Guys, I have a field defined with the following custom data type, <fieldType name="cust_str" class="solr.TextField" positionIncrementGap="100" sortMissingLast="true"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory"

Re: Using the Collections API

2013-05-17 Thread A.Eibner
Hi Jared, yes that is what I want to achieve: Creating a master and a replica and I want them to be separate nodes. I just realized that I posted the wrong URL, I was already using the parameter maxShardsPerNode=1. But just to be sure, I also tried it with your URL and I get the same

Re: Searching for terms having embedded white spaces like word1 word2

2013-05-17 Thread Jack Krupansky
Is this really a text field where you want to search for tokenized keywords? Or is it a string field where you wish strictly to deal with equality of the entire string or explicit wildcards for substring matches, as you've shown? You haven't told us your full requirements for this field. The

Insane FieldCache usage when using group.facet=true

2013-05-17 Thread Elisabeth Adler
Dear all, I am running a grouped query including facets in my JUnit test cases against a Solr 4.2.1 Embedded Server. When faceting the groups, I want the counts to reflect the number of groups, not the number of documents. But when I enable group.facet=true on the query, the test fails with the

Re: Zookeeper Ensemble Startup Parameters For SolrCloud?

2013-05-17 Thread Mark Miller
The way Solr uses ZK, unless you are also using ZK with something else, I wouldn't worry about it at all. In a steady state, the cluster won't even really talk to ZK in any intensive manner at all. - Mark On May 16, 2013, at 5:07 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi Shawn; I

Adding field in schema.xml

2013-05-17 Thread Kamal Palei
Hi All, I am trying to add a few fields in the schema.xml file as below. <field name="salary" type="long" indexed="true" stored="true" /> <field name="experience" type="long" indexed="true" stored="true" /> * <field name="last_updated_date" type="tdate" indexed="true" stored="true" default="NOW" multiValued="false" /> *

Re: Solr cloud Some basic questions

2013-05-17 Thread Jack Krupansky
Start by simply experimenting with 2 shards and 2 replicas - 4 nodes. And just run zk on the nodes themselves for simple experiments. It's better to deploy zk separate from the Solr nodes, but for simple testing it shouldn't matter. Get experience with SolrCloud using a simple configuration

Re: indexing unrelated tables in single core

2013-05-17 Thread Gora Mohanty
On 16 May 2013 19:11, Rohan Thakur rohan.i...@gmail.com wrote: hi Mohanty, I tried what you suggested: using id as the common field, changing the SQL query to point to id, and using id as uniqueKey. It is working, but now what it is doing is just keeping the id's that are not same in both the

Re: SurroundQParser does not analyze the query text

2013-05-17 Thread Isaac Hebsh
Thank you Erik and Jack. I opened a JIRA issue: https://issues.apache.org/jira/browse/SOLR-4834 I hope I will have time to submit a patch file soon. On Fri, May 17, 2013 at 7:38 AM, Jack Krupansky j...@basetechnology.com wrote: (Erik: Or he can get the LucidWorks Search product and then use

Bloom Filters

2013-05-17 Thread Isaac Hebsh
Hi everyone. I'm indexing docs into Solr using the update request handler, by POSTing data to the REST endpoint (not SolrJ, not DIH). My indexer should return an indication of whether the document existed in the collection before or not, based on its ID. The obvious solution is to perform a

Re: Adding field in schema.xml

2013-05-17 Thread Alexandre Rafalovitch
Do you have the types corresponding to those fields present? Specifically, long. You don't get any special type names out of the box, they all need to be present in types area. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn:

RE: Speed up import of Hierarchical Data

2013-05-17 Thread Dyer, James
Using SqlEntityProcessor with cacheImpl=SortedMapBackedCache is the same as specifying CachedSqlEntityProcessor. Because the pluggable caches are only partially committed, I never added details to the wiki, so it still refers to CachedSEP. But it's the same thing. What is new here, though, is

Keyword aware Tokenizer?

2013-05-17 Thread Kai Gülzau
Does anybody know of a tokenizer which can be configured with (multiple) regular expressions to mark some of the input text as keyword and behave like StandardTokenizer (or UAX29URLEmailTokenizer) otherwise? Input: Does my order 4711.0815!-somecode_and.other(stuff) arrive on friday? Tokens:

Re: Java heap space exception in 4.2.1

2013-05-17 Thread J Mohamed Zahoor
Hprof introspection shows that huge double arrays are using up 75% of heap space, and they belong to Lucene's FieldCache. ./zahoor On 17-May-2013, at 12:47 PM, J Mohamed Zahoor zah...@indix.com wrote: Hi, I moved to 4.2.1 from 4.1 recently. Everything was working fine until I added few

Re: Question about Edismax - Solr 4.0

2013-05-17 Thread Sandeep Mestry
Hello Jack, Thanks for pointing the issues out and for your valuable suggestion. My preliminary tests were okay on search but I will be doing more testing to see if this has impacted any other searches. Thanks once again and have a nice sunny weekend, Sandeep On 17 May 2013 05:35, Jack

Re: error while switching from log4j back to slf4j with solr 4.3

2013-05-17 Thread Shawn Heisey
On 5/17/2013 12:25 AM, Bernd Fehling wrote: Actually there is no real container in my eclipse debugging env :-( /opt/indigo/eclipse/configuration/org.eclipse.osgi/bundles/884/1/.cp/lib/jetty-webapp-8.1.2.v20120308.jar Then it should be copied to lib/ext of eclipse/run-jetty-run

Re: Facet pivot 50.000.000 different values

2013-05-17 Thread Shawn Heisey
On 5/17/2013 2:47 AM, Carlos Bonilla wrote: To calculate some stats we are using a field B with 50.000.000 different values as facet pivot in a schema that contains 200.000.000 documents. We only need to calculate how many different B values have more than 1 document, but it takes ages. Is

Re: Solr 4 memory usage increase

2013-05-17 Thread Andre Bois-Crettez
Can you explain your setup more? I.e., is it master/slave, indexing in parallel, etc.? We had to commit more often to reduce JVM memory usage due to transaction logs in SolrCloud mode, compared with previous setups without tlogs. update?commit=true&openSearcher=false André On 05/17/2013 09:56
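The explicit commit mentioned above (a hard commit that does not open a new searcher) can be sketched as a URL builder. This is an illustration only: the helper name `build_commit_url` and the core URL in the usage line are assumptions, and the function merely builds the string without contacting Solr.

```python
from urllib.parse import urlencode


def build_commit_url(core_url, open_searcher=False):
    """Return an update URL that triggers a commit (hypothetical helper)."""
    params = {
        "commit": "true",
        # With openSearcher=false the commit is flushed to disk, but searches
        # keep using the current searcher until a later commit opens a new one.
        "openSearcher": "true" if open_searcher else "false",
    }
    return f"{core_url}/update?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed core URL for illustration only.
    print(build_commit_url("http://localhost:8983/solr/mycore"))
```

Such a URL could be requested on a schedule (for example from cron) when autocommit is disabled, which is one way to keep transaction logs from growing between commits.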

Re: Using the Collections API

2013-05-17 Thread Shawn Heisey
On 5/17/2013 4:03 AM, Jared Rodriguez wrote: So it sounds like you want the collection created with a master and a replica and you want one to be on each node? If so, I believe that you can get that effect by specifying maxShardsPerNode=1 as part of your url line. This will tell solr to

StandardTokenizer vs. hyphens

2013-05-17 Thread Kai Gülzau
Is there some StandardTokenizer Implementation which does not break words on hyphens? I think it would be more flexible to retain hyphens and use a WordDelimiterFactory to split these tokens. StandardTokenizer today: doc1: email - email doc2: e-mail - e|mail doc3: e mail - e|mail query1:

Re: Java heap space exception in 4.2.1

2013-05-17 Thread Shawn Heisey
On 5/17/2013 1:17 AM, J Mohamed Zahoor wrote: I moved to 4.2.1 from 4.1 recently. Everything was working fine until I added a few more stats queries. Now I am getting this error so frequently that Solr does not run even for 2 minutes continuously. All 5GB is getting used instantaneously in few

Re: Using the Collections API

2013-05-17 Thread Mark Miller
What version of Solr? I think there was a bug a couple versions back (perhaps introduced in 4.1 if I remember right) that made it so creates were not spread correctly. - Mark

Re: StandardTokenizer vs. hyphens

2013-05-17 Thread Shawn Heisey
On 5/17/2013 10:26 AM, Kai Gülzau wrote: Is there some StandardTokenizer Implementation which does not break words on hyphens? I think it would be more flexible to retain hyphens and use a WordDelimiterFactory to split these tokens. You can use the whitespace tokenizer with WDF. This is

Bugs with edismax parser

2013-05-17 Thread anonymous user
Hi Solr Users, I've been migrating our existing app from Solr 3.6.2 to Solr 4.3 and I've come across some strange behaviour that I think demonstrates one or more bugs in the edismax parser. -- Setup -- with a clean copy of

Re: Is payload the right solution for my problem?

2013-05-17 Thread jasimop
I think I just found the solution. Would the right strategy be to store the original XML content and then use a solr.HTMLStripCharFilterFactory when querying? I just made a quick test and it works; the only problem now is that it also finds the data contained in the XML attribute fields. I think

Re: Java heap space exception in 4.2.1

2013-05-17 Thread J Mohamed Zahoor
Memory increases a lot with queries that have facets… ./Zahoor On 17-May-2013, at 10:00 PM, Shawn Heisey s...@elyograg.org wrote: On 5/17/2013 1:17 AM, J Mohamed Zahoor wrote: I moved to 4.2.1 from 4.1 recently. Everything was working fine until I added a few more stats queries. Now I am

Re: Solr 4 memory usage increase

2013-05-17 Thread William Bell
Yeah, how do you turn off the index writer? On Friday, May 17, 2013, Andre Bois-Crettez wrote: Can you explain your setup more? I.e., is it master/slave, indexing in parallel, etc.? We had to commit more often to reduce JVM memory usage due to transaction logs in SolrCloud mode, compared with

Re: having trouble storing large text blob fields - returns binary address in search results

2013-05-17 Thread Gora Mohanty
On 17 May 2013 00:02, geeky2 gee...@hotmail.com wrote: [...] i have tried setting them up as clob fields - but this is not working (see details below) i have also tried treating them as plain string fields (removing the references to clob in the DIH) - but this does not work either. DIH

Re: Facet pivot 50.000.000 different values

2013-05-17 Thread Mikhail Khludnev
On Fri, May 17, 2013 at 12:47 PM, Carlos Bonilla carlosbonill...@gmail.com wrote: We only need to calculate how many different B values have more than 1 document, but it takes ages. Carlos, It's not clear whether you need to take results of a query into account or just gather statistics from

Upgrading from SOLR 3.5 to 4.2.1 Results.

2013-05-17 Thread Rishi Easwaran
Hi All, It's Friday 3:00pm, warm and sunny outside, and it was a good week. Figured I'd share some good news. I work for the AOL mail team and we use SOLR for our mail search backend. We have been using it since pre-SOLR 1.4 and are strong supporters of the SOLR community. We deal with millions of indexes and

Re: Solr 4 memory usage increase

2013-05-17 Thread Wei Zhao
Here is the JVM info: $ java -version java version 1.6.0_26 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

protect solr pages

2013-05-17 Thread gpssolr2020
Hi, I want to implement security through a Jetty realm in Solr 4. So I configured the related settings in realm.properties, jetty.xml, and webdefault.xml under /solrhome/example/etc. But it is still not working. Please advise. Thanks.

Re: Solr 4 memory usage increase

2013-05-17 Thread Wei Zhao
We have a master/slave setup. We disabled autocommits/autosoftcommits. So the slave only replicates from the master and serves queries. The master does all the indexing and commits every 5 minutes. The slave polls the master every 2.5 minutes and does replication. Both tests with Solr 3.5 and 4.2 were run with the same

Re: Upgrading from SOLR 3.5 to 4.2.1 Results.

2013-05-17 Thread Lance Norskog
This is great; data like this is rare. Can you tell us any hardware or throughput numbers? On 05/17/2013 12:29 PM, Rishi Easwaran wrote: Hi All, It's Friday 3:00pm, warm and sunny outside, and it was a good week. Figured I'd share some good news. I work for the AOL mail team and we use SOLR for our

Re: Zookeeper Ensemble Startup Parameters For SolrCloud?

2013-05-17 Thread Furkan KAMACI
Hi Mark; Thanks for the answer. Do Solr nodes hold the current state of the cluster (which the Zookeeper ensemble knows) inside their cache/RAM? 2013/5/17 Mark Miller markrmil...@gmail.com The way Solr uses ZK, unless you are also using ZK with something else, I wouldn't worry about it at all. In

Re: Zookeeper Ensemble Startup Parameters For SolrCloud?

2013-05-17 Thread vsilgalis
As an example, I have 9 SOLR nodes (3 clusters of 3) using different versions of SOLR (4.1, 4.1, and 4.2.1), utilizing the same zookeeper ensemble (3 servers), using chroot for the different configs across clusters. My zookeeper servers are just VMs, dual-core with 1GB of RAM and are only used

Re: having trouble storing large text blob fields - returns binary address in search results

2013-05-17 Thread geeky2
Hello Gora, thank you for the reply - i did finally get this to work. i had to cast the column in the DIH to a clob - like this. cast(att.attr_val AS clob) as attr_val, cast(rsr.rsr_val AS clob) as rsr_val, once this was done, the ClobTransformer worked. to my knowledge - this

RE: Is payload the right solution for my problem?

2013-05-17 Thread Petersen, Robert
Hi It will not be double the disk space at all. You will not need to store the field you search, only the field being returned needs to be stored. Furthermore if you are not searching the XML field you will not need to index that field, only store it. Hope that helps, Robi -Original

Question about attributes

2013-05-17 Thread Thomas Portegys
First time on the forum. We are planning to use Solr to house some data mining information, and we are thinking of using attributes to add some semantic information to indexed content. As a test, I wrote a filter that adds an animal attribute to tokens like dog, cat, etc. After adding a document with

RE: Speed up import of Hierarchical Data

2013-05-17 Thread O. Olson
Thank you James. I think I got this to work using CachedSqlEntityProcessor – and it seems extremely fast. I will try SortedMapBackedCache on Monday :-). Thank you, O. O. Dyer, James-2 wrote Using SqlEntityProcessor with cacheImpl=SortedMapBackedCache is the same as specifying

Re: Question about attributes

2013-05-17 Thread Jack Krupansky
Maybe you want to set the payload for each term, based on your animal attribute. Then there is minimal support in Solr for payloads. There is no immediate filter for capturing an arbitrary attribute. Take a look at TypeAsPayloadTokenFilterFactory . You could do something similar, like

HttpClient version

2013-05-17 Thread Jamie Johnson
I am trying to use Solr inside of another framework (Storm) that provides a version of HttpClient (4.1.x) that is incompatible with the latest version that SolrJ requires (4.3.x). Is there a way to use the older version of HttpClient with SolrJ? Are there any issues with using an earlier SolrJ

Re: HttpClient version

2013-05-17 Thread Shawn Heisey
On 5/17/2013 5:06 PM, Jamie Johnson wrote: I am trying to use Solr inside of another framework (Storm) that provides a version of HttpClient (4.1.x) that is incompatible with the latest version that SolrJ requires (4.3.x). Is there a way to use the older version of HttpClient with SolrJ? Are

Re: protect solr pages

2013-05-17 Thread Tim Vaillancourt
A lot of people (including me) are asking for this type of support in this JIRA: https://issues.apache.org/jira/browse/SOLR-4470 Although brought up frequently on the list, the effort doesn't seem to be moving too much. I can confirm that the most recent patch on this JIRA will work with the

Re: having trouble storing large text blob fields - returns binary address in search results

2013-05-17 Thread Gora Mohanty
On 18 May 2013 02:24, geeky2 gee...@hotmail.com wrote: Hello Gora, thank you for the reply - i did finally get this to work. i had to cast the column in the DIH to a clob - like this. cast(att.attr_val AS clob) as attr_val, cast(rsr.rsr_val AS clob) as rsr_val, once this was