Re: Not able to search Spanish word with ascent in solr

2013-05-30 Thread Deep Lotia
Hi, I am having the same kind of issue. I am not able to search for accented Spanish characters. For example: Según, próximos, etc. I have a field called attr_content which holds the content of a PDF file whose contents are in Spanish. I am using Apache Tika to index the contents of a PDF file. I have

Automatic cross linking

2013-05-30 Thread It-forum
Hello, I'm looking to use Solr for creating cross linking in text. For example: I'd like to be able to request a text field, an article, in my blog, and have Solr use a script/method/request to parse the text, find all matching category terms and cap the results. Do you have any

RE: Support for Mongolian language

2013-05-30 Thread Sagar Chaturvedi
I have already checked this link. Could not find any hint about Mongolian language. Is there any plugin available for that? -Original Message- From: bbarani [mailto:bbar...@gmail.com] Sent: Thursday, May 30, 2013 2:04 AM To: solr-user@lucene.apache.org Subject: Re: Support for Mongolian

Re: Problem with xpath expression in data-config.xml

2013-05-30 Thread Hans-Peter Stricker
Thanks for having analyzed the problem. But please let me note that I came to a somewhat different conclusion. Define for the moment title to be the primary unique key: solr-4.3.0\example\example-DIH\solr\rss\conf\schema.xml <uniqueKey>title</uniqueKey>

[DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread jerome . dupont
Hello, I want to index a huge list of XML files. _ Using FileListEntityProcessor causes an OutOfMemoryException (too many files...) _ I can do it using a LineEntityProcessor reading a list of files, generated externally, but I would prefer to generate the list in Solr _ So to avoid to

Re: Query syntax error: Cannot parse ....

2013-05-30 Thread Yago Riveiro
Hi, Indeed, with character # encoded the query works fine. Thanks -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Wednesday, May 29, 2013 at 9:43 PM, bbarani wrote: # has a separate meaning in URL.. You need to encode that..
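
A minimal sketch of why the unencoded `#` breaks the query, using Python's standard `urllib.parse`. The host, handler path and the query value `tag:#1` are assumptions made up for illustration; they do not come from the original thread.

```python
from urllib.parse import quote, urlsplit

# Hypothetical URL; only the '#' character matters here.
raw_url = "http://localhost:8983/solr/select?q=tag:#1"
parts = urlsplit(raw_url)

# Everything after '#' is parsed as a URL fragment, so it never
# reaches the server as part of the query string.
print(parts.query)
print(parts.fragment)

# Percent-encoding the value turns '#' into %23 and keeps it in the query.
encoded_value = quote("tag:#1", safe=":")
fixed_url = "http://localhost:8983/solr/select?q=" + encoded_value
print(urlsplit(fixed_url).query)
```

Most HTTP client libraries apply this encoding automatically when parameters are passed as a mapping rather than pasted into the URL by hand.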

Re: Sorting results by last update date

2013-05-30 Thread Kamal Palei
Thanks Shalini... It is Solr 3.6.2. Instead of NOW, I can use today's date (I did not know about this cache issue, thanks). Later I realized that it was my own mistake that scrambled the asc and desc ordering of the results. After I get the data from Solr, I run a MySQL query, where the order changes again.

Re: Sorting results by last update date

2013-05-30 Thread Tom Gullo
sort=last_updated_date desc Maybe adding %20 will help: sort=last_updated_date%20desc
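
As a hedged illustration of the `%20` suggestion, the sort parameter can be encoded with Python's standard library instead of by hand. Only the parameter name and value come from the message; the rest is a sketch.

```python
from urllib.parse import quote, urlencode

params = {"sort": "last_updated_date desc"}

# quote_via=quote encodes the space as %20; the default (quote_plus)
# would produce '+', which query-string parsers also read as a space.
with_percent20 = urlencode(params, quote_via=quote)
with_plus = urlencode(params)

print(with_percent20)
print(with_plus)
```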

Re: [DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread Alexandre Rafalovitch
Did you declare that field name in outer entity? Not just select as in the query. Regards, Alex On 30 May 2013 04:31, jerome.dup...@bnf.fr wrote: Hello, I want to index a huge list of XML files. _ Using FileListEntityProcessor causes an OutOfMemoryException (too many files...)

Re: Automatic cross linking

2013-05-30 Thread Alexandre Rafalovitch
Do it outside of solr or look at update request processors. E.g. UIMA integration as an example. Regards, Alex On 30 May 2013 02:52, It-forum it-fo...@meseo.fr wrote: Hello, I'm looking to use Solr for creating cross linking in text. For example: I'd like to be able to request a

SPLITSHARD: time out error

2013-05-30 Thread yriveiro
Hi, I have a time out error when I try to split a collection with 15M documents The exception (solr version 4.3): 542468 [catalina-exec-27] INFO org.apache.solr.servlet.SolrDispatchFilter – [admin] webapp=null path=/admin/collections

Re: multiple field join?

2013-05-30 Thread Erick Erickson
Solr Join is _not_ sql subquery and won't work like one. There's a reason it's called pseudo join in the JIRA issues. My advice. Forget joins and try to write this in pure Solr query language. The more you try to use Solr like a database, the more you'll get into trouble. De-normalize your data

Re: Not able to search Spanish word with ascent in solr

2013-05-30 Thread Erick Erickson
Deep: Have you looked through the rest of the thread and tried the suggestions? If so, what were the results? Best Erick On Thu, May 30, 2013 at 2:45 AM, Deep Lotia deeplo...@gmail.com wrote: Hi, I am having the same kind of issue. I am not able to search for accented Spanish characters. For

Removing a single value from a multiValue field

2013-05-30 Thread Dotan Cohen
I have a Solr application with a multiValue field 'tags'. All fields are indexed in this application. There exists a uniqueKey field 'id' and a '_version_' field. This is running on Solr 4.x. In order to add a tag, the application retrieves the full document, creates a PHP array from the document

Re: Problem with PatternReplaceCharFilter

2013-05-30 Thread Jack Krupansky
Just count the characters in the literal portions of the patterns and include that many spaces in the replacement. So, TextLine would become . It gets trickier if names are variable length. But I'm sure you could come up with patterns to replace one, two, three, etc. char names with

Re: Support for Mongolian language

2013-05-30 Thread Jack Krupansky
No, there is not. -- Jack Krupansky -Original Message- From: Sagar Chaturvedi Sent: Thursday, May 30, 2013 3:03 AM To: solr-user@lucene.apache.org Subject: RE: Support for Mongolian language I have already checked this link. Could not find any hint about Mongolian language. Is there

RE: Upgrade Solr index from 4.0 to 4.2.1

2013-05-30 Thread Elran Dvir
So having tried all combinations of LUCENE_40, 41 and 42 we're still having no success in getting our indexes to load with Solr 4.2.1... Any direction we can look into ? in our system the underlying data is very slow to re-index and would take an unreasonable amount of time at a customer site

Re: Sorting results by last update date

2013-05-30 Thread Jack Krupansky
You can just use NOW/DAY for a filter that would only change once a day: [NOW/DAY-60DAY TO NOW/DAY] Oops... make that: [NOW/DAY-60DAY TO NOW/DAY+1DAY] Otherwise, it would miss dates after the start of today. Even better, make it: [NOW/DAY-60DAY TO *] -- Jack Krupansky -Original
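
The effect of the three range variants can be sketched outside Solr with plain `datetime` arithmetic. This assumes that `NOW/DAY` rounds down to the start of the current day (UTC here); the timestamps and the field name `last_updated_date` (taken from another message in this thread) are illustrative only.

```python
from datetime import datetime, timedelta, timezone

def start_of_day(ts):
    """Mimic NOW/DAY: round a timestamp down to midnight (assumed behaviour)."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

# Hypothetical "current time" and a document updated earlier the same day.
now = datetime(2013, 5, 30, 10, 0, tzinfo=timezone.utc)
doc_time = datetime(2013, 5, 30, 9, 0, tzinfo=timezone.utc)

lower = start_of_day(now) - timedelta(days=60)           # NOW/DAY-60DAY
upper_today = start_of_day(now)                           # NOW/DAY
upper_tomorrow = start_of_day(now) + timedelta(days=1)    # NOW/DAY+1DAY

in_first = lower <= doc_time <= upper_today       # [NOW/DAY-60DAY TO NOW/DAY]
in_second = lower <= doc_time <= upper_tomorrow   # [NOW/DAY-60DAY TO NOW/DAY+1DAY]
in_open = lower <= doc_time                       # [NOW/DAY-60DAY TO *]

# The open-ended form written as a filter query string.
fq = "last_updated_date:[NOW/DAY-60DAY TO *]"
print(in_first, in_second, in_open, fq)
```

Because every bound is rounded to a day boundary, the filter string stays identical for a whole day, which is what makes it cache-friendly.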

Re: Reindexing strategy

2013-05-30 Thread Dotan Cohen
On Wed, May 29, 2013 at 5:37 PM, Shawn Heisey s...@elyograg.org wrote: It's impossible for us to give you hard numbers. You'll have to experiment to know how fast you can reindex without killing your servers. A basic tenet for such experimentation, and something you hopefully already know:

Re: What exactly happens to extant documents when the schema changes?

2013-05-30 Thread Dotan Cohen
On Wed, May 29, 2013 at 5:09 PM, Shawn Heisey s...@elyograg.org wrote: I handle this in a very specific way with my sharded index. This won't work for all designs, and the precise procedure won't work for SolrCloud. There is a 'live' and a 'build' core for each of my shards. When I want to

Re: Removing a single value from a multiValue field

2013-05-30 Thread Jack Krupansky
First, you cannot do any internal editing of a multi-valued list, other than: 1. Replace the entire list. 2. Add values on to the end of the list. But you can do both of those operations on a single multivalued field with atomic update without reading and writing the entire document.

Re: Sorting results by last update date

2013-05-30 Thread Jack Krupansky
I wrote Otherwise, it would miss dates after the start of today, but that should be Otherwise, it would miss documents with times after the start of today if the current time is before noon. But use * and you will be better off anyway. -- Jack Krupansky -Original Message- From: Jack

Re: Problem with xpath expression in data-config.xml

2013-05-30 Thread Shalin Shekhar Mangar
Ah, I missed that part. The problem that you have is because you have forEach="/feed/entry" but you want to read /feed/link as a common field. You need to have forEach="/feed | /feed/entry" which should let you have both /feed/link as well as /feed/entry/link. On Thu, May 30, 2013 at 1:25 PM,

Re: Removing a single value from a multiValue field

2013-05-30 Thread Dotan Cohen
On Thu, May 30, 2013 at 3:42 PM, Jack Krupansky j...@basetechnology.com wrote: First, you cannot do any internal editing of a multi-valued list, other than: 1. Replace the entire list. 2. Add values on to the end of the list. Thank you. I meant that I am actually editing the entire

Re: Removing a single value from a multiValue field

2013-05-30 Thread Jack Krupansky
You gave an XML example, so I assumed you were working with XML! In JSON... [{"id": "doc-id", "tags": {"add": ["a", "b"]}}] and [{"id": "doc-id", "tags": {"set": null}}] BTW, this kind of stuff is covered in the book, separate chapters for XML and JSON, each with dozens of examples like this. -- Jack
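
A small sketch of how the two JSON bodies above could be generated from application code. The helper name `atomic_update` is hypothetical; the `add` and `set` modifiers and the `tags` field are the ones named in the message.

```python
import json

def atomic_update(doc_id, field, modifier, value):
    """Build a one-document atomic-update body (hypothetical helper)."""
    return [{"id": doc_id, field: {modifier: value}}]

# Append two tags to the multivalued field.
add_payload = json.dumps(atomic_update("doc-id", "tags", "add", ["a", "b"]))

# Setting the field to null removes all of its values.
clear_payload = json.dumps(atomic_update("doc-id", "tags", "set", None))

print(add_payload)
print(clear_payload)
```

To drop a single tag under these constraints, the application would compute the new list itself and send it with `set`, which replaces the whole list without resending the rest of the document.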

Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Jonathan Rochkind
I am trying to get Solr installed in Tomcat, and having trouble. I am trying to use the instructions at http://wiki.apache.org/solr/SolrTomcat as a guide. Trying to start with the example Solr from the Solr distro. Tried using the Tried with both a binary distro with existing solr.war, and

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Steve Rowe
Hi Jonathan, Did you find http://stackoverflow.com/questions/3016808/tomcat-startup-logs-severe-error-filterstart-how-to-get-a-stack-trace ? Steve On May 30, 2013, at 10:10 AM, Jonathan Rochkind rochk...@jhu.edu wrote: I am trying to get Solr installed in Tomcat, and having trouble. I am

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Alexandre Rafalovitch
Usually tomcat errors with Solr 4.3 happen due to uncopied logging libraries. I would check if installing Solr 4.2.1 works and/or copy additional libraries in (search mailing list for this issue). However, I am not entirely sure that's the case here. It feels that perhaps the definition of the

Fwd: indexing only selected fields

2013-05-30 Thread Igor Littig
-- Forwarded message -- From: Igor Littig igor.lit...@gmail.com Date: 2013/5/30 Subject: indexing only selected fields To: solr-user-...@lucene.apache.org Hello everyone. I'm quite new in Solr and need your advice... Does anybody know how to index not all fields in an uploading

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Shawn Heisey
I am trying to get Solr installed in Tomcat, and having trouble. When I start up tomcat, I get in the Tomcat log: INFO: Deploying web application archive solr.war May 29, 2013 3:59:40 PM org.apache.catalina.core.StandardContext start SEVERE: Error filterStart May 29, 2013 3:59:40 PM

Re: indexing only selected fields

2013-05-30 Thread Alexandre Rafalovitch
How are you submitting your document? Some methods automatically ignore unknown fields, others complain. In any case, there is always a way to define an ignored field type. The schema.xml in the main example shows how to do it. Search for 'ignored'. But beware that this will hide all spelling and

Re: Fwd: indexing only selected fields

2013-05-30 Thread Shawn Heisey
-- Forwarded message -- From: Igor Littig igor.lit...@gmail.com Date: 2013/5/30 Subject: indexing only selected fields To: solr-user-...@lucene.apache.org Hello everyone. I'm quite new in Solr and need your advice... Does anybody know how to index not all fields in an

Re: indexing only selected fields

2013-05-30 Thread Igor Littig
Alex, thank you for the answer. I am submitting by POST method via curl... For example when I want to submit a document I'm typing in the command line: curl 'http://localhost:8983/solr/update/json?commit=true' --data-binary @base.info -H 'Content-type:application/json' where base.info is my file
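
For comparison, a rough Python equivalent of the curl call, built with the standard library only. The URL and content type are taken from the command above; the helper name and the sample document are assumptions, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request

def build_update_request(docs, base_url="http://localhost:8983/solr"):
    """Build (but do not send) a JSON update request (hypothetical helper)."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        base_url + "/update/json?commit=true",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )

# Hypothetical document standing in for the contents of base.info.
req = build_update_request([{"id": "1", "name": "example"}])
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually submit it to a running server.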

RE: [DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread Dyer, James
I don't want to dissuade you from trying but I believe FileListEntityProcessor has something special coded up into it to allow for its unique usage. Not sure if your approach isn't do-able. I would imagine that fixing FLEP to handle a row-at-a-time or page-at-a-time in memory wouldn't be

Re: Fwd: indexing only selected fields

2013-05-30 Thread Jack Krupansky
Update Request Processors to the rescue! Example - Ignore input values for any undefined fields. Add to solrconfig: <updateRequestProcessorChain name="ignore-undefined"> <processor class="solr.IgnoreFieldUpdateProcessorFactory" /> <processor class="solr.LogUpdateProcessorFactory" /> <processor

Re: indexing only selected fields

2013-05-30 Thread Alexandre Rafalovitch
If you just want to remove anything that does not match, then the 'ignored' field type in the example schema would work. If you want to ignore specific fields but complain on any unexpected things you can still use specific fields but with ignored type. Or you could use Update Request Processors like

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Jonathan Rochkind
Thanks! I guess I should have asked on-list BEFORE wasting 4 hours fighting with it myself, but I was trying to be a good user and do my homework! Oh well. Off to the logging instructions, hope I can figure them out -- if you could update the tomcat instructions with the simplest possible

Re: SPLITSHARD: time out error

2013-05-30 Thread Shalin Shekhar Mangar
Shard splitting is buggy in 4.3. I recommend that you wait for the next release (4.3.1) before using this feature. That being said, the split is executed by the Overseer and will continue to happen even after the http request times out. There aren't enough hooks to monitor the progress of the

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Jonathan Rochkind
I'm going to add a note to http://wiki.apache.org/solr/SolrLogging , with the Tomcat sample Error filterStart error, as an example of something you might see if you have not set up logging. Then at least in the future, googling solr tomcat error filterStart might lead someone to the clue that

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Shawn Heisey
On 5/30/2013 9:26 AM, Jonathan Rochkind wrote: Thanks! I guess I should have asked on-list BEFORE wasting 4 hours fighting with it myself, but I was trying to be a good user and do my homework! Oh well. Off to the logging instructions, hope I can figure them out -- if you could update the

Re: Re: [DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread jerome . dupont
Hi, Thanks for your answer, it helped me move forward. The name of the entity was wrong: it was not consistent with the schema. Now the first entity works fine: the query is sent to the database and returns the correct result. The problem is that the second entity, which is an XPathEntityProcessor entity, doesn't

Pivot Facets refining datetime, bleh

2013-05-30 Thread Andrew Muldowney
I've been trying to get into how distributed field facets do their work but I haven't been able to uncover how they deal with this issue. Currently distrib pivot facets does a getTermCounts(first_field) to populate a list at the level its working on. When putting together the data structure we

Re: Re: [DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread Alexandre Rafalovitch
On Thu, May 30, 2013 at 11:44 AM, jerome.dup...@bnf.fr wrote: <entity name="processorDocument" processor="XPathEntityProcessor" datasource="racineNoticeDatasource"

Re: indexing only selected fields

2013-05-30 Thread Igor Littig
Ok, that is clear. Thanks for the answer. 2013/5/30 Alexandre Rafalovitch arafa...@gmail.com If you just want to remove anything that does not match, then the 'ignored' field type in the example schema would work. If you want to ignore specific fields but complain on any unexpected things you can

Rollback from Solr4.2.1 to Solr3.5

2013-05-30 Thread adityab
Hi, We recently had a production release to upgrade our Solr 3.5 to Solr 4.2.1. (No schema changes except some basic ones required for 4.2.1.) The nature of our documents is that we have huge multivalued fields. They can go from 1000 to 100K in one single field. # Documents : 300K # Index size: 9GB

Re: java.lang.IllegalAccessError when invoking protected method from another class in the same package path but different jar.

2013-05-30 Thread bbarani
Hoss, thanks a lot for the explanation. We override most of the methods of query component(prepare,handleResponses,finishStage etc..) to incorporate custom logic and we set the _responseDocs values based on custom logic (after filtering out few data) and then we call the parent(super)

Re: solr 4.3: write.lock is not removed

2013-05-30 Thread bbarani
How are you indexing the documents? Are you using an indexing program? The post below discusses the same issue: http://lucene.472066.n3.nabble.com/removing-write-lock-file-in-solr-after-indexing-td3699356.html

Continue Indexing Documents when single doc does not match schema

2013-05-30 Thread Iain Lopata
I am using Nutch 1.6 and Solr 1.4.1 on Ubuntu in local mode and using Nutch's solrindex to index documents into Solr. When indexing documents, I hit an occasional document that does not match the Solr schema. For example, a document which has two address fields when my Solr schema.xml does

solr 3.6 use only one CPU

2013-05-30 Thread Mingfeng Yang
We have a Solr instance running on a 4 CPU box. Sometimes, we send a query to our Solr server and it takes up 100% of one CPU and 60% of memory. I assume that if we send another query request, Solr should be able to use another idle CPU. However, it is not the case. Using top, I only see

Re: Continue Indexing Documents when single doc does not match schema

2013-05-30 Thread Shawn Heisey
On 5/30/2013 11:03 AM, Iain Lopata wrote: When indexing documents, I hit an occasional document that does not match the Solr schema. For example, a document which has two address fields when my Solr schema.xml does not specify address as being multi-valued (and I do not want it to be).

Re: solr 3.6 use only one CPU

2013-05-30 Thread Shawn Heisey
On 5/30/2013 11:12 AM, Mingfeng Yang wrote: We have a Solr instance running on a 4 CPU box. Sometimes, we send a query to our Solr server and it takes up 100% of one CPU and 60% of memory. I assume that if we send another query request, Solr should be able to use another idle CPU. However,

Find rows within range of other rows

2013-05-30 Thread Mike Ree
I need to do a query where I need to find all people who have done 2 events within a range. I currently log one row per an event. Example: Person,Date,ViewedUrl 1,2012May10,google.com 2,2012May10,yahoo.com 1,2012May13,yahoo.com 2,2012May13,google.com Sample request would be wanting to find all
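
The exact request is cut off in the archive, so the following is only a client-side sketch under an assumed rule: find people who viewed one URL and then another within a given number of days. It runs on the four sample rows above in plain Python and is not a Solr query; the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

def parse_day(text):
    """Parse dates written like '2012May10'."""
    return date(int(text[:4]), MONTHS[text[4:7]], int(text[7:]))

def people_with_sequence(rows, first_url, second_url, max_days):
    """Return people who viewed first_url and then second_url within max_days."""
    by_person = defaultdict(list)
    for person, day, url in rows:
        by_person[person].append((parse_day(day), url))

    matches = []
    for person, events in by_person.items():
        firsts = [d for d, u in events if u == first_url]
        seconds = [d for d, u in events if u == second_url]
        if any(timedelta(0) <= s - f <= timedelta(days=max_days)
               for f in firsts for s in seconds):
            matches.append(person)
    return sorted(matches)

# The sample rows from the message: Person, Date, ViewedUrl.
rows = [
    ("1", "2012May10", "google.com"),
    ("2", "2012May10", "yahoo.com"),
    ("1", "2012May13", "yahoo.com"),
    ("2", "2012May13", "google.com"),
]

print(people_with_sequence(rows, "google.com", "yahoo.com", max_days=5))
```

With one row per event, this kind of cross-row condition needs either client-side grouping as above or a per-person document that carries all of that person's events.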

Re: Continue Indexing Documents when single doc does not match schema

2013-05-30 Thread Alexandre Rafalovitch
On Thu, May 30, 2013 at 1:03 PM, Iain Lopata ilopa...@hotmail.com wrote: For example, a document which has two address fields when my Solr schema.xml does not specify address as being multi-valued (and I do not want it to be). No help on the core topic, but a workaround for the specific

RE: solr 4.3: write.lock is not removed

2013-05-30 Thread Zhang, Lisheng
Hi, We just use CURL from PHP code to submit indexing request, like: /update?commit=true.. This worked well in solr 3.6.1. I saw the link you showed and really appreciate (if no other choice I will change java source code but hope there is a better way..)? Thanks very much for helps, Lisheng

RE: solr 4.3: write.lock is not removed

2013-05-30 Thread Zhang, Lisheng
I did more tests and got more info: the basic setting is that we created the core from the PHP cURL API where we define: schema config instanceDir=my_solr_home dataDir=my_solr_home/data/new_collection_name In Solr 3.6.1 we do not need to define schema/config because the conf folder is not inside each

Re: multiple field join?

2013-05-30 Thread Chris Hostetter
: My advice. Forget joins and try to write this in pure : Solr query language. The more you try to use Solr like : a database, the more you'll get into trouble. De-normalize : your data and try again. with that important caveat in mind, it is worth noting that what you are essentially asking

Re: solr 4.3: write.lock is not removed

2013-05-30 Thread Chris Hostetter
: I recently upgraded solr from 3.6.1 to 4.3, it works well, but I noticed that after finishing : indexing : : write.lock : : is NOT removed. Later if I index again it still works OK. Only after I shutdown Tomcat : then write.lock is removed. This behavior caused some problem like I could

Re: Grouping results based on the field which matched the query

2013-05-30 Thread Chris Hostetter
: I wanted to know if Solr has some functionality to group results based on : the field that matched the query. : : So if I have id, name and manufacturer in my document structure, I want to : know how many results are there because its manufacturer matched the q and : how many results are there

Collections API Reload killing my cloud

2013-05-30 Thread davers
Every time I try to do a reload using the Collections API my entire cloud goes down and I cannot search it. The solrconfig.xml and schema.xml are good because when I just restart tomcat everything works fine. Here is the output of the collections api reload command: 59155087

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Jonathan Rochkind
Okay, sadly, i still can't get this to work. Following the instructions at: https://wiki.apache.org/solr/SolrLogging#Using_the_example_logging_setup_in_containers_other_than_Jetty I copied solr/example/lib/ext/*.jar into my tomcat's ./lib, and copied solr/example/resources/log4j.properties

Re: Collections API Reload killing my cloud

2013-05-30 Thread Mark Miller
https://issues.apache.org/jira/browse/SOLR-4805 - Mark On May 30, 2013, at 3:09 PM, davers dboych...@improvementdirect.com wrote: Every time I try to do a reload using the Collections API my entire cloud goes down and I cannot search it. The solrconfig.xml and schema.xml are good because when

Re: Collections API Reload killing my cloud

2013-05-30 Thread davers
Is it possible that this has something to do with it? 59157032 [Thread-2] INFO org.apache.solr.cloud.Overseer – Update state numShards=null message={ numShards=null

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Shawn Heisey
On 5/30/2013 1:19 PM, Jonathan Rochkind wrote: Okay, sadly, i still can't get this to work. Following the instructions at: https://wiki.apache.org/solr/SolrLogging#Using_the_example_logging_setup_in_containers_other_than_Jetty I copied solr/example/lib/ext/*.jar into my tomcat's ./lib, and

SolrCloud running away with resources

2013-05-30 Thread ltenny
I've set up a simple 10 node, 5 shard SolrCloud 4.3. I'm pushing just a few thousand documents into it. What I'm doing is rather write intensive 100x...more writes than reads. I've noticed that there seems to be an unbounded use of resources. I'm seeing a steadily increasing number of network

Re: Solr 4.3, Tomcat, Error filterStart

2013-05-30 Thread Jonathan Rochkind
Okay, for posterity: I did manage to get it working. It WAS lack of the logging files. First, the only way I could manage to get Tomcat6 to log an actual stacktrace for the Error filterStart was to _delete_ my CATALINA_HOME/conf/logging.properties file. Apparently without this file at all,

indexing documents

2013-05-30 Thread Igor Littig
Good day everyone. I recently faced another problem. I've got a bunch of documents to index. The problem is that they are at the same time the database for another application. These documents are stored in JSON format in the following scheme: { id: 10, name: dad 177, cat:[{ id:254, name:124 }]

Re: 2 VM setup for SOLRCLOUD?

2013-05-30 Thread Jason Hellman
Jamey, You will need a load balancer on the front end to direct traffic into one of your SolrCore entry points. It doesn't matter, technically, which one though you will find benefits to narrowing traffic to fewer (for purposes of better cache management). Internally SolrCloud will

2 VM setup for SOLRCLOUD?

2013-05-30 Thread James Dulin
Working to setup SolrCloud in Windows Azure. I have read over the solr Cloud wiki, but am a little confused about some of the deployment options. I am attaching an image for what I am thinking we want to do. 2 VM's that will have 2 shards spanning across them. 4 Nodes total across the two

RE: solr 4.3: write.lock is not removed

2013-05-30 Thread Zhang, Lisheng
Hi, Thanks very much for the explanation! Could we config to get to old behavior? I asked this option because our app has many small cores so that we prefer create/close writer on the fly (otherwise we may have memory issue quickly). We also do not need NRT for now. Thanks very much for helps,

RE: solr starting time takes too long

2013-05-30 Thread Zhang, Lisheng
Hi Eric, Thanks very much for helps (I should have responded sooner): 1/ My problem in 3.6 turned out to be much related to the fact I did not share schema, after using shareSchema, the start time is reduced up to 80% (to my great surprise, previously I thought burden is most in

Re: OPENNLP problems

2013-05-30 Thread Lance Norskog
I will look at these problems. Thanks for trying it out! Lance Norskog On 05/28/2013 10:08 PM, Patrick Mi wrote: Hi there, Checked out branch_4x and applied the latest patch LUCENE-2899-current.patch however I ran into 2 problems Followed the wiki page instruction and set up a field with

RE: solr 4.3: write.lock is not removed

2013-05-30 Thread Zhang, Lisheng
I did more tests and it seems that this is still a bug (previous issue 3/): 1/ Create a core by CURL command with dataDir=some_folder, core is created OK and later indexing worked OK also. 2/ But in solr.xml, dataDir is not defined in element core 3/ After restart solr, dataDir

Strip HTML Tags and Store

2013-05-30 Thread Kalyan Kuram
Hi All, I am trying to understand what gets stored when I configure a field as indexed and stored. For example, I have this in my schema.xml: <field name="articleBody" type="text_general" indexed="true" stored="true" /> and <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100"

Re: Reindexing strategy

2013-05-30 Thread Michael Sokolov
On 5/30/2013 8:30 AM, Dotan Cohen wrote: On Wed, May 29, 2013 at 5:37 PM, Shawn Heisey s...@elyograg.org wrote: It's impossible for us to give you hard numbers. You'll have to experiment to know how fast you can reindex without killing your servers. A basic tenet for such experimentation, and

RE: Support for Mongolian language

2013-05-30 Thread Sagar Chaturvedi
What would be the steps if we want to use Mongolian or any other language that is not supported? -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Thursday, May 30, 2013 5:43 PM To: solr-user@lucene.apache.org Subject: Re: Support for Mongolian language No,

Re: Strip HTML Tags and Store

2013-05-30 Thread Jack Krupansky
Update Request Processors to the rescue again. Namely, the HTML Strip Field Update processor. Add to your solrconfig: <updateRequestProcessorChain name="html-strip-features"> <processor class="solr.HTMLStripFieldUpdateProcessorFactory"> <str name="fieldName">features</str> </processor>

Re: Support for Mongolian language

2013-05-30 Thread Alexandre Rafalovitch
Well, you would need a tokenizer, probably a stemmer, a list of stop-words (to ignore). Is the original text in UTF-8 or is it in some alternative encoding? A quick search showed that there is an academic paper where they are trying to work with Mongolian to get it into Lucene. It seems quite

RE: Support for Mongolian language

2013-05-30 Thread Sagar Chaturvedi
Thanks Alexandre for the link. It was really helpful. The original text will be in UTF-8. -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, May 31, 2013 8:41 AM To: solr-user@lucene.apache.org Subject: Re: Support for Mongolian language Well, you

Re: Support for Mongolian language

2013-05-30 Thread Jack Krupansky
Try using the text_general field type and see how reasonable or unreasonable the standard tokenizer is at identifying reasonable word breaks for some sample Mongolian text. Use the Solr Admin UI Analyzer page to see what the various term analysis filters output. -- Jack Krupansky

RE: Support for Mongolian language

2013-05-30 Thread Sagar Chaturvedi
Hi, On solr admin UI, in a query I am trying to highlight some fields. I have set hl = true, given name of comma separated fields in hl.fl but fields are not getting highlighted. Any insights? Regards, Sagar
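
For reference, a sketch of the request parameters described above as they would appear on the query string. The query and the field names `name` and `features` are assumptions; `hl` and `hl.fl` are the parameters named in the message.

```python
from urllib.parse import parse_qs, urlencode

params = {
    "q": "name:solr",           # hypothetical query
    "hl": "true",               # switch highlighting on
    "hl.fl": "name,features",   # comma-separated fields to highlight (assumed names)
}
query_string = urlencode(params)
print(query_string)

# Round-trip check: the server should see exactly these values.
decoded = parse_qs(query_string)
print(decoded["hl.fl"])
```

One thing worth checking in a case like this is whether the fields listed in `hl.fl` are stored, since highlighting generally works on stored field content.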

Highlighting fields

2013-05-30 Thread Sagar Chaturvedi
Sorry for wrong subject. Corrected it. -Original Message- From: Sagar Chaturvedi [mailto:sagar.chaturv...@nectechnologies.in] Sent: Friday, May 31, 2013 11:25 AM To: solr-user@lucene.apache.org Subject: RE: Support for Mongolian language Hi, On solr admin UI, in a query I am trying to