Re: Sending Documents via SolrServer as MapReduce Jobs at Solrj

2013-07-05 Thread Furkan KAMACI
is it better to require another large software system (Hadoop), when it works fine without it? That just sounds like more stuff to configure, misconfigure, and cause problems with indexing. wunder On Jul 5, 2013, at 4:48 AM, Furkan KAMACI wrote: We are using Nutch to crawl web sites

Document count mismatch

2013-07-09 Thread Furkan KAMACI
I've run a command to find term counts in my index: solr/select/?q=*:*&rows=0&facet=on&facet.field=teno&wt=xml&indent=on It gives me a result like this: ... <result name="response" numFound="3245092" start="0" maxScore="1.0"></result> ... <lst name="teno"> <int name="lev">3107206</int> <int name="tenu">59821</int> ... when I

Re: Document count mismatch

2013-07-09 Thread Furkan KAMACI
of returned facet values to a larger or smaller value than the default of 100. 3. Try reading the Faceting chapter of my book! -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Tuesday, July 09, 2013 8:09 AM To: solr-user@lucene.apache.org Subject: Document count mismatch

Re: Document count mismatch

2013-07-09 Thread Furkan KAMACI
$MergeSortQueue.init(TopDocs.java:143) at org.apache.lucene.search.TopDocs.merge(TopDocs.java:214) ... 2013/7/9 Jack Krupansky j...@basetechnology.com I don't quite follow the question. Give us an example. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Tuesday, July 09

Re: Switch to new leader transparently?

2013-07-10 Thread Furkan KAMACI
You can define a CloudSolrServer like this: *private static CloudSolrServer solrServer;* then define the address of your ZooKeeper host: *private static String zkHost = "localhost:9983";* and initialize your variable: *solrServer = new CloudSolrServer(zkHost);* You can get leader list as

Re: Switch to new leader transparently?

2013-07-10 Thread Furkan KAMACI
You can check the source code of LBHttpSolrServer and try to implement something like that as your own. 2013/7/10 Floyd Wu floyd...@gmail.com Hi anshum Thanks for your response. My application is developed using C#, so I can't use CloudSolrServer with SolrJ. My problem is there is a

Re: Switch to new leader transparently?

2013-07-10 Thread Furkan KAMACI
, how to fetch/get cluster state from zk directly in plain http or tcp socket? In my SolrCloud cluster, I'm using standalone zk to coordinate. Floyd 2013/7/10 Furkan KAMACI furkankam...@gmail.com You can define a CloudSolrServer as like that: *private static CloudSolrServer solrServer

Re: Usage of CloudSolrServer?

2013-07-12 Thread Furkan KAMACI
CloudSolrServer uses LBHttpSolrServer by default. CloudSolrServer connects to ZooKeeper and passes the live nodes to LBHttpSolrServer. LBHttpSolrServer connects to each node in round-robin order. By the way, do you mean leader instead of master? 2013/7/12 sathish_ix skandhasw...@inautix.co.in Hi , Iam
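
To make the round-robin idea concrete, here is a minimal sketch in plain Java. It is not SolrJ code and not how LBHttpSolrServer is actually implemented; the class name RoundRobinNodes and the node URLs are hypothetical, and the list of live nodes is assumed to have been obtained elsewhere (for example from ZooKeeper).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy round-robin selector (hypothetical illustration, not SolrJ code).
final class RoundRobinNodes {
    private final List<String> liveNodes;                       // node URLs, assumed to be known already
    private final AtomicInteger counter = new AtomicInteger();  // how many requests were handed out

    RoundRobinNodes(List<String> liveNodes) {
        if (liveNodes.isEmpty()) {
            throw new IllegalArgumentException("no live nodes");
        }
        this.liveNodes = new ArrayList<>(liveNodes);            // defensive copy
    }

    // Returns the node that should serve the next request, cycling through the list.
    String next() {
        int i = Math.floorMod(counter.getAndIncrement(), liveNodes.size());
        return liveNodes.get(i);
    }
}
```

With three nodes, successive calls to next() return the first, second and third node and then start again from the first, so requests are spread evenly.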

Re: Leader Election, when?

2013-07-12 Thread Furkan KAMACI
If you plan to have 2 shards, the first node you start up will be the leader of the first shard. When you start up the second node, it will be the leader of the second shard. If you start up a third node, it will be a replica of the first shard. If you start up a fourth node, it will be the replica of
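
The start-up order described above can be sketched as a small function. This is only an illustration of the pattern in the message, not Solr's actual assignment code; the class and method names are hypothetical, and since the message is cut off at the fourth node, the sketch assumes the same rotation simply continues.

```java
// Hypothetical illustration of the start-up order described above.
final class ShardAssignment {
    // nodeIndex is 0-based: 0 is the first node started, 1 the second, and so on.
    static String roleOf(int nodeIndex, int numShards) {
        if (nodeIndex < 0 || numShards <= 0) {
            throw new IllegalArgumentException("nodeIndex must be >= 0 and numShards > 0");
        }
        int shard = nodeIndex % numShards + 1;                 // shards are filled in rotation
        String role = nodeIndex < numShards ? "leader" : "replica";
        return "shard" + shard + " " + role;                   // first node per shard leads
    }
}
```

For 2 shards, roleOf(0, 2) gives "shard1 leader", roleOf(1, 2) gives "shard2 leader" and roleOf(2, 2) gives "shard1 replica", matching the first three cases in the message.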

Re: solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper

2013-07-12 Thread Furkan KAMACI
If you have one collection you just need to define hostnames of Zookeeper ensembles and run that command once. 2013/7/11 Zhang, Lisheng lisheng.zh...@broadvision.com Hi, We are testing solr 4.3.0 in Tomcat (considering upgrading solr 3.6.1 to 4.3.0), in WIKI page for solrCloud in Tomcat:

Re: How to set a condition on the number of docs found

2013-07-12 Thread Furkan KAMACI
Do you want to modify the Solr source code? Did you check this line in XMLWriter.java: *writeAttr("numFound",Long.toString(numFound));* 2013/7/12 Matt Lieber mlie...@impetus.com Hello there, I would like to be able to know whether I got over a certain threshold of doc results. I.e. Test

Does Solrj Batch Processing Querying May Confuse?

2013-07-12 Thread Furkan KAMACI
I've crawled some web pages and indexed them in Solr. I've queried the data in Solr via SolrJ. url is my unique field and I've defined my query like this: ModifiableSolrParams params = new ModifiableSolrParams(); params.set("q", "lang:tr"); params.set("fl", "url"); params.set("sort", "url desc"); I've run my

Re: preferred container for running SolrCloud

2013-07-13 Thread Furkan KAMACI
Of course you may have reasons to use Tomcat or something else (e.g. your staff may have more experience with Tomcat). However, developers generally run Jetty because it is the default for Solr, and I should point out that Solr unit tests run against Jetty (in fact, a specific version of Jetty) and

How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?

2013-07-15 Thread Furkan KAMACI
When I search Google for something that has non-ASCII characters, it returns results for both the original and the ascified versions and *highlights both of them*. For example, if I search for *çiğli* on Google, the first result is: *Çiğli* Belediyesi www.*cigli*.bel.tr/ How can I do that in Solr? How can I

Re: Switching to using SolrCloud with tomcat7 and embedded zookeeper

2013-07-17 Thread Furkan KAMACI
If you are not defining global Java startup parameters, do not include them in setenv.sh. Pass those arguments as parameters when you start up your jar. 2013/7/17 smanad sma...@gmail.com Originally i was running a single solr 4.3 instance with 4 cores ... and now starting to learn about

Why Sort Doesn't Work?

2013-07-17 Thread Furkan KAMACI
I run a query on my Solr 4.2.1 SolrCloud: /solr/select?q=*:*&rows=300&wt=csv&fl=url&sort=url asc The result is as follows: http://goethetc.blogspot.com/ http://about.deviantart.com/contact/ http://browse.deviantart.com/designbattle/ http://browse.deviantart.com/digitalart/

Re: Why Sort Doesn't Work?

2013-07-17 Thread Furkan KAMACI
Hi Markus; This is default schema at Nutch. Do you mean there is a bug with schema? 2013/7/17 Markus Jelsma markus.jel...@openindex.io Remove the WDF from the analysis chain, it's not going to work with multiple tokens. -Original message- From:Furkan KAMACI

Re: Why Sort Doesn't Work?

2013-07-17 Thread Furkan KAMACI
Hi Markus; What is that score? It is not listed at schema. Is it document boost? 2013/7/17 Markus Jelsma markus.jel...@openindex.io No, there is no bug in the schema, it is just an example and provides the most common usage only; sort by score. -Original message- From:Furkan

Does early EOF Results With Document Loss To Index?

2013-07-17 Thread Furkan KAMACI
While indexing into my SolrCloud (Solr 4.2.1) from Hadoop I got an error. What is the reason, and does it result in document loss during indexing? ERROR - 2013-07-17 16:30:01.453; org.apache.solr.common.SolrException; java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException]

How can I learn the total count of how many documents indexed and how many documents updated?

2013-07-17 Thread Furkan KAMACI
I have crawled some web pages and indexed them into my SolrCloud (Solr 4.2.1). However, the index already contained some documents before I indexed them. I can calculate the difference between the current and previous document counts. However, that does not mean I have indexed that many documents. Because urls of

Re: How can I learn the total count of how many documents indexed and how many documents updated?

2013-07-17 Thread Furkan KAMACI
I will open a Jira for it and apply a patch, thanks. 2013/7/17 Jack Krupansky j...@basetechnology.com I don't think that breakdown is readily available from Solr. Sounds like a good Jira request for improvement in the response. -- Jack Krupansky -Original Message- From: Furkan

Re: How can I learn the total count of how many documents indexed and how many documents updated?

2013-07-18 Thread Furkan KAMACI
Heisey s...@elyograg.org On 7/17/2013 8:06 AM, Furkan KAMACI wrote: I have crawled some web pages and indexed them at my SolrCloud(Solr 4.2.1). However before I index them there was already some indexes. I can calculate the difference between current and previous document count. However

Re: How can I learn the total count of how many documents indexed and how many documents updated?

2013-07-18 Thread Furkan KAMACI
...@elyograg.org On 7/17/2013 8:06 AM, Furkan KAMACI wrote: I have crawled some web pages and indexed them at my SolrCloud(Solr 4.2.1). However before I index them there was already some indexes. I can calculate the difference between current and previous document count. However

IDNA Support For Solr

2013-07-19 Thread Furkan KAMACI
Hi; Is there any support for IDNA at Solr? (IDNA: http://en.wikipedia.org/wiki/Internationalized_domain_name)

Re: IDNA Support For Solr

2013-07-19 Thread Furkan KAMACI
I mean this: there is a web address: *çorba.com* However, its IDNA-encoded version is: *xn--orba-zoa.com* You can check it here: * http://www.whois.com.tr/?q=%C3%A7orbasldtld=com* Let's assume that I've indexed a web page with that URL: *xn--orba-zoa.com* and one
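
One possible way to make both spellings of such a host name comparable is to normalize them on the client side before indexing or querying. The sketch below uses the JDK's java.net.IDN class; it is not a Solr feature, and the helper name IdnaHostKey is hypothetical.

```java
import java.net.IDN;
import java.util.Locale;

// Hypothetical helper: normalize a host name to its ASCII (Punycode) form.
final class IdnaHostKey {
    // Both the Unicode spelling and the already encoded spelling map to the same key.
    static String toAsciiHost(String host) {
        String lower = host.trim().toLowerCase(Locale.ROOT);
        return IDN.toASCII(lower);   // ASCII-only labels pass through unchanged
    }
}
```

Calling toAsciiHost with the Unicode form of the host from the message and with its xn-- form yields the same string, so the two could be stored in (or queried against) a single normalized field.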

Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Furkan KAMACI
Hi; Sometimes a large part of a document may exist in another document, as in student plagiarism or a blog post quoted in another blog post. Does Solr/Lucene or one of its libraries (UIMA, OpenNLP, etc.) have any class to detect it?

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Furkan KAMACI
Furkan KAMACI furkankam...@gmail.com Hi; Sometimes a huge part of a document may exist in another document. As like in student plagiarism or quotation of a blog post at another blog post. Does Solr/Lucene or its libraries (UIMA, OpenNLP, etc.) has any class to detect it?

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Furkan KAMACI
the first, say, 2-3 hits 4. mark it as quote / plagiarism 5. eventually train a classifier to help you mark other texts as quote / plagiarism HTH, Tommaso 2013/7/23 Furkan KAMACI furkankam...@gmail.com Actually I need a specialized algorithm. I want to use that algorithm to detect duplicate

WikipediaTokenizer for Removing Unnecesary Parts

2013-07-23 Thread Furkan KAMACI
Hi; I have indexed Wikipedia data with Solr DIH. However, when I look at the data indexed in Solr I see something like this as well: {| style=text-align: left; width: 50%; table-layout: fixed; border=0 |- valign=top | style=width: 50%| :*[[Ubuntu]] :*[[Fedora]] :*[[Mandriva]] :*[[Linux Mint]]

Re: WikipediaTokenizer for Removing Unnecesary Parts

2013-07-23 Thread Furkan KAMACI
? You should just see the text tokens plus the URLs for links. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Tuesday, July 23, 2013 10:53 AM To: solr-user@lucene.apache.org Subject: WikipediaTokenizer for Removing Unnecesary Parts Hi; I have indexed wikipedia

Usage Of Real Time Get Handler Of Solr

2013-07-24 Thread Furkan KAMACI
Hi; There is a real time get handler in Solr: <!-- realtime get handler, guaranteed to return the latest stored fields of any document, without the need to commit or open a new searcher. The current implementation relies on the updateLog feature being enabled. --> <requestHandler

How to Read Solr Admin Stats Page?

2013-07-24 Thread Furkan KAMACI
I am indexing and checking the admin stats page. I see this: commits:471 autocommit maxTime:15000ms autocommits:414 soft autocommits:0 optimizes:12 docsPending:388 adds:305 cumulative_adds:2154245

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-25 Thread Furkan KAMACI
for you :) roman On Tue, Jul 23, 2013 at 11:07 AM, Shashi Kant sk...@sloan.mit.edu wrote: Here is a paper that I found useful: http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf On Tue, Jul 23, 2013 at 10:42 AM, Furkan KAMACI furkankam...@gmail.com wrote: Thanks

Difference between qf and pf parameters

2013-07-26 Thread Furkan KAMACI
Here is an example from the example solrconfig file: <str name="qf">content^0.5 anchor^1.0 title^1.2</str> <str name="pf">content^0.5 anchor^1.5 title^1.2 site^1.5</str> What is the difference between the qf and pf parameters? They both boost fields, but there should be a difference?

Highlight Problem

2013-07-26 Thread Furkan KAMACI
This was in the example solrconfig file: <requestHandler name="/browse" class="solr.SearchHandler"> <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <float name="tie">0.01</float> <str name="qf">content^0.5 anchor^1.0 title^1.2</str> <str

Re: Highlight Problem

2013-07-26 Thread Furkan KAMACI
Ok, I've found that there was no problem in the config. 2013/7/26 Furkan KAMACI furkankam...@gmail.com This was in the example solrconfig file: <requestHandler name="/browse" class="solr.SearchHandler"> <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str>

Exact Match

2013-07-26 Thread Furkan KAMACI
When I run this query: solr/select?q=url:ftp://&wt=xml&fl=url I get results like this: <result name="response" numFound="1441" start="0" maxScore="9.5640335"> <doc><str name="url">http://forum.whmdestek.com/ftp-makaleleri/</str></doc> <doc><str name="url">http://www.netadi.com/ftp-kurulumu.php</str></doc> Why it does not

Synonym Phrase

2013-07-26 Thread Furkan KAMACI
I have a synonyms file like this: cart; shopping cart; market trolley When I analyse my query I see that when I search for cart these become synonyms: cart, shopping, market, trolley so cart is a synonym of shopping. How should I define my synonyms.txt file so that it will understand that cart

How to Make That Domains Should Be First?

2013-07-26 Thread Furkan KAMACI
When I search for wikipedia, the home page of Wikipedia is not the first result: http://www.wikipedia.org/ The first result is: http://en.wikipedia.org/wiki/Spain How can I say that the domains of web sites should come first in SolrCloud? (I want something like grouping at domains and boosting at url

Re: Synonym Phrase

2013-07-26 Thread Furkan KAMACI
be as simple as scanning for the synonym phrases and then adding OR terms for the synonym phrases. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Friday, July 26, 2013 10:53 AM To: solr-user@lucene.apache.org Subject: Synonym Phrase I have a synonyms file as like

Re: Synonym Phrase

2013-07-26 Thread Furkan KAMACI
Should I rewrite it like this: shopping cart => market trolley, cart or something like that? 2013/7/26 Furkan KAMACI furkankam...@gmail.com Why does Solr not split those terms by *;* I think that it splits by both *;* and the white space character? 2013/7/26 Jack Krupansky j

Exact Search Problem

2013-07-26 Thread Furkan KAMACI
Let's assume that I have these urls in my index: www.abc.com www.abc.com/a www.abc.com/b www.abc.com/c ... www.abc.com/x How can I do an exact search for www.abc.com ? url:www.abc.com doesn't work because it also returns www.abc.com/a, www.abc.com/b etc.

Query Performance

2013-07-28 Thread Furkan KAMACI
What is the difference between: q=*:*&rows=row_count&sort=id asc and q={X TO *}&rows=row_count&sort=id asc Does the first one try to get all the documents and then cut the result, or are they the same, or...? What happens in Solr's underlying process for those two queries?

Re: Query Performance

2013-07-28 Thread Furkan KAMACI
row_count might be. Generally, you should only sort a small number of documents/results. Or, consider DocValues since they are designed for sorting. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Sunday, July 28, 2013 5:06 PM To: solr-user@lucene.apache.org Subject

RAM Usage Debugging

2013-07-29 Thread Furkan KAMACI
When I look at my dashboard I see that 27.30 GB is available for the JVM, 24.77 GB is gray and 16.50 GB is black. I am not doing anything on my machine right now. Did it cache documents, or is there a problem? How can I find out?

AND Queries

2013-07-29 Thread Furkan KAMACI
I am searching for keywords like this: lang:en AND url:book pencil cat It returns results, however none of them includes all of the book, pencil and cat keywords. How should I rewrite my query? I tried this: lang:en AND url:(book AND pencil AND cat) and it looks OK. However this not:

Re: AND Queries

2013-07-29 Thread Furkan KAMACI
. So, lang:en AND url:book AND pencil AND cat is interpreted as: lang:en AND url:book AND default_field:pencil AND default_field:cat The default search field is defined in your schema.xml file (defaultSearchField) Franck Brisbart On Monday, 29 July 2013 at 12:06 +0300, Furkan KAMACI
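
The interpretation described above can be imitated with a toy rewrite in plain Java. This is a deliberately simplified illustration and not the Lucene query parser; the class name DefaultFieldDemo is hypothetical, and it only handles whitespace-separated terms and the AND, OR and NOT operators.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration (not the Lucene query parser): bare terms go to the default field.
final class DefaultFieldDemo {
    static String explain(String query, String defaultField) {
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        List<String> out = new ArrayList<>();
        for (String token : trimmed.split("\\s+")) {
            boolean operator = token.equals("AND") || token.equals("OR") || token.equals("NOT");
            boolean fielded = token.contains(":");      // already has an explicit field
            out.add(operator || fielded ? token : defaultField + ":" + token);
        }
        return String.join(" ", out);
    }
}
```

Feeding it the query from the message with default_field as the default field name reproduces the expansion quoted above, which is why only the first term after url: is searched in the url field.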

How to run a verification process at pre-commit documents and then commit them into live indexes if they are valid?

2013-08-02 Thread Furkan KAMACI
I use Solr 4.2.1 as SolrCloud. My live indexes will be searched by a huge number of users and I don't want anything to go wrong. I have some criteria for my indexes, e.g. there mustn't be spam documents in my index (I have a spam detector tool), some documents should be on the first result page (or

Re: How to run a verification process at pre-commit documents and then commit them into live indexes if they are valid?

2013-08-02 Thread Furkan KAMACI
of deleteQueries with all undesired spam words before commit? On Fri, Aug 2, 2013 at 12:48 PM, Furkan KAMACI furkankam...@gmail.com wrote: I use Solr 4.2.1 as SolrCloud. My live indexes will be search by huge amounts of users and I don't want to have anything wrong. I have some criteria

Top Ten Terms

2013-08-05 Thread Furkan KAMACI
Hi; When I click Schema Browser at Admin Page and load term info for a field I get top ten terms. When I click question mark near it, it redirects me to Solr query page. Query is that at page: http://localhost:8983/solr/#/collection1/query?q=content:[* TO *] What is the relation between that

Re: Top Ten Terms

2013-08-05 Thread Furkan KAMACI
, 2013 at 10:48 AM, Furkan KAMACI wrote: Hi; When I click Schema Browser at Admin Page and load term info for a field I get top ten terms. When I click question mark near it, it redirects me to Solr query page. Query is that at page: http://localhost:8983/solr/#/collection1/query?q

Getter API for SolrCloud

2013-08-15 Thread Furkan KAMACI
I've implemented an application that connects my UI and SolrCloud. I want to write code that makes a search request to SolrCloud and sends the result to my UI. I know that there are some examples of this, but I want a fast and really good way to do it. One way I did: ModifiableSolrParams params

Re: Getter API for SolrCloud

2013-08-15 Thread Furkan KAMACI
Here is a conversation about it: http://lucene.472066.n3.nabble.com/SolrCloud-with-Zookeeper-ensemble-in-production-environment-SEVERE-problems-td4047089.html However the result of conversation is not clear. Any ideas? 2013/8/15 Furkan KAMACI furkankam...@gmail.com I've implemented

Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Furkan KAMACI
Hi; I want to write an analyzer that will prevent some special words. For example, if the sentence to be indexed is: diet follower it will tokenize it like this: token 1) diet token 2) follower token 3) diet follower How can I do that with Solr?

Re: Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Furkan KAMACI
prevent any keywords. You need to elaborate the specific requirements with more detail. Given a long stream of text, what tokenization do you expect in the index? -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Monday, August 19, 2013 8:07 AM To: solr-user@lucene.apache.org

Re: Prevent Some Keywords at Analyzer Step

2013-08-21 Thread Furkan KAMACI
Krupansky -Original Message- From: Furkan KAMACI Sent: Monday, August 19, 2013 11:22 AM To: solr-user@lucene.apache.org Subject: Re: Prevent Some Keywords at Analyzer Step Let's assume that my sentence is that: *Alice is a diet follower* My special keyword = *diet

Re: Solr Indexing Status

2013-08-21 Thread Furkan KAMACI
You know the size of CSV files and you can calculate it if you want. 2013/8/21 Prasi S prasi1...@gmail.com Hi, I am using solr 4.4 to index csv files. I am using solrj for this. At frequent intervels my user may request for Status. I have to send get something like in DIH Indexing in

How to Manage RAM Usage at Heavy Indexing

2013-08-24 Thread Furkan KAMACI
I ran a test on my SolrCloud. I tried to send 100 million documents via Hadoop to a node of mine which has no replica. When the number of documents sent to that node is around 30 million, the RAM usage of my machine reaches 99% (Solr heap usage is not 99%, it uses just 3GB - 4GB of RAM). Some time later my node

Re: How to Manage RAM Usage at Heavy Indexing

2013-08-25 Thread Furkan KAMACI
providing much in the way of details to help us help you. Best Erick On Sat, Aug 24, 2013 at 1:52 PM, Furkan KAMACI furkankam...@gmail.com wrote: I make a test at my SolrCloud. I try to send 100 millions documents into my node which has no replica via Hadoop. When document count send

Dropping Caches of Machine That Solr Runs At

2013-08-25 Thread Furkan KAMACI
Sometimes Physical Memory usage of Solr is over 99% and this may cause problems. Do you run a command like this periodically: sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches" to force dropping the caches of the machine that Solr runs on and avoid problems?

Re: Dropping Caches of Machine That Solr Runs At

2013-08-25 Thread Furkan KAMACI
not drop caches after a time later and why recovery resulted with such a high Physical Memory usage.) 2013/8/25 Furkan KAMACI furkankam...@gmail.com Sometimes Physical Memory usage of Solr is over %99 and this may cause problems. Do you run such kind of a command periodically: sudo sh -c sync

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
Hi Walter; You are right about performance. However when I index documents on a machine that has a high percentage of Physical Memory usage I get EOF errors? 2013/8/26 Walter Underwood wun...@wunderwood.org On Aug 25, 2013, at 1:41 PM, Furkan KAMACI wrote: Sometimes Physical Memory usage

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
...@wunderwood.org What is the precise error? What kind of machine? File buffers are a robust part of the OS. Unix has had file buffer caching for decades. wunder On Aug 26, 2013, at 1:37 AM, Furkan KAMACI wrote: Hi Walter; You are right about performance. However when I index

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
buffers. wunder On Aug 26, 2013, at 9:17 AM, Furkan KAMACI wrote: It has a 48 GB of RAM and index size is nearly 100 GB at each node. I have CentOS 6.4. While indexing I got that error and I am suspicious about that it is because of high percentage of Physical Memory usage. ERROR - 2013

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
is 8GB. We have several cores, totaling about 14GB on disk. This configuration allows 100% of the indexes to be in file buffers. wunder On Aug 26, 2013, at 9:57 AM, Furkan KAMACI wrote: Hi Walter; You said you are caching your documents. What is average Physical Memory usage of your Solr

Re: Unknown attribute id in add:allowDups

2013-09-07 Thread Furkan KAMACI
I did not use the PECL package and the problem may be related to that. I want to ask: when you define your schema you indicate *required=true* However, the error mentions *allowDups* for the id field. So it seems that id is not a unique field for that package. You may need to configure something else at

Re: Connection Established but waiting for response for a long time.

2013-09-07 Thread Furkan KAMACI
Could you give us more information about your other Jetty configurations? 2013/9/6 qungg qzheng1...@gmail.com Hi, I'm runing solr 4.0 but using legacy distributed search set up. I set the shards parameter for search, but indexing into each solr shards directly. The problem I have been

Re: Indexing pdf files - question.

2013-09-07 Thread Furkan KAMACI
Could you show us logs you get when you start your web container? 2013/9/4 Nutan Shinde nutanshinde1...@gmail.com My solrconfig.xml is: requestHandler name=/update/extract class=solr.extraction.ExtractingRequestHandler lst name=defaults str name=fmap.contentdesc/str !-to map this

Re: Adding weight to location of the string found

2013-09-07 Thread Furkan KAMACI
Firstly, did you check here: http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/search/package-summary.html#package_description 2013/8/28 zseml zs...@hotmail.com In Solr syntax, is there a way to add weight to the result found based on the location of the string that it's found?

Re: Regarding reducing qtime

2013-09-07 Thread Furkan KAMACI
What is your question here? 2013/9/6 prabu palanisamy pr...@serendio.com Hi I am currently using solr -3.5.0 indexed by wikipedia dump (50 gb) with java 1.6. I am searching the tweets in the solr. Currently it takes average of 210 millisecond for each post, out of which 200 millisecond is

Re: Can we used CloudSolrServer for searching data

2013-09-07 Thread Furkan KAMACI
Shalin is right. If you read the documentation for CloudSolrServer you will see that: *SolrJ client class to communicate with SolrCloud. Instances of this class communicate with Zookeeper to discover Solr endpoints for SolrCloud collections, and then use the LBHttpSolrServer to issue requests.* *

Re: SolrCloud - shard containing an invalid host:port

2013-09-07 Thread Furkan KAMACI
If that line (192.168.1.10:8983/solr) is gray instead of green, it is probably because you started up a Solr instance without defining a port and it has registered itself in ZooKeeper. 2013/9/3 Daniel Collins danwcoll...@gmail.com Was it a test instance that you created 8983 is the

Re: Tweaking boosts for more search results variety

2013-09-07 Thread Furkan KAMACI
What do you mean by *these limitations*? Do you want to do multiple groupings at the same time? 2013/9/6 Sai Gadde gadde@gmail.com Thank you Jack for the suggestion. We can try group by site. But considering that number of sites are only about 1000 against the index size of 5 million, One

Re: How to Manage RAM Usage at Heavy Indexing

2013-09-09 Thread Furkan KAMACI
be tuned in /etc/sysctl.conf On Sun, Aug 25, 2013 at 4:23 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi Erick; I wanted to get a quick answer that's why I asked my question as that way. Error is as follows: INFO - 2013-08-21 22:01:30.978

Re: using tika inside SOLR vs using nutch

2013-09-10 Thread Furkan KAMACI
If you have tens of millions of documents to parse and want to do that job inside Solr, then it means that you will put extra workload on Solr. If there are many queries going to your Solr node, then you should consider that CPU and RAM may not be enough for you while both parsing and somebody is

Re: Solr PingQuery

2013-09-16 Thread Furkan KAMACI
I want to add one more thing for Shawn about ZooKeeper. In order to have a quorum, you need a majority of the servers available (half the servers plus one for an even-sized ensemble). Because of that, let's assume you have 4 ZooKeeper machines, two of them can communicate with each other and the other two can communicate with each other. Assume
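
The quorum arithmetic above can be written down as a short sketch. The class and method names are hypothetical and this is not ZooKeeper code; it only expresses the majority rule described in the message.

```java
// Hypothetical helper expressing the majority rule described above.
final class ZkQuorum {
    // Smallest number of servers that forms a strict majority of the ensemble.
    static int majority(int ensembleSize) {
        if (ensembleSize <= 0) {
            throw new IllegalArgumentException("ensembleSize must be positive");
        }
        return ensembleSize / 2 + 1;    // integer division: 4 -> 3, 5 -> 3, 3 -> 2
    }

    // True if a group of mutually reachable servers can form a quorum.
    static boolean hasQuorum(int reachable, int ensembleSize) {
        return reachable >= majority(ensembleSize);
    }
}
```

In the 4-machine example, each group of two servers gives hasQuorum(2, 4) == false, so neither side of the split can serve. A 4-server ensemble also tolerates only one failure, the same as a 3-server ensemble, which is why odd sizes are the usual choice.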

Re: Slow query at first time

2013-09-16 Thread Furkan KAMACI
What is the query time of your search? I mean like this: QueryResponse solrResponse = query(solrParams); solrResponse.getQTime(); 2013/9/16 Sergio Stateri stat...@gmail.com Hi, I´m trying to make a search with Solr 4.4, but in the first time the search is too slow. I have studied about

Re: Solr node goes down while trying to index records

2013-09-17 Thread Furkan KAMACI
Do you get that error only when indexing? 2013/9/17 neoman harira...@gmail.com Hello everyone, one or more of the nodes in the solrcloud go down randomly when we try to index data using solrj APIs. The nodes do recover. but when we try to index back, they go down again Our configuration:

Re: Stop zookeeper from batch

2013-09-17 Thread Furkan KAMACI
Are you looking for this: https://issues.apache.org/jira/browse/ZOOKEEPER-1122 On Monday, 16 September 2013, Prasi S prasi1...@gmail.com wrote: Hi, We have setup solrcloud with zookeeper and 2 tomcats . we are using a batch file to start the zookeeper, uplink config

Limits of Document Size at SolrCloud and Faced Problems with Large Size of Documents

2013-09-17 Thread Furkan KAMACI
Currently I have over 50 million documents in my index and, as I mentioned before in another question, I have some problems while indexing (Jetty EOF exception). I know that the problem may not be about index size, but I just want to learn whether there is any limit on document size in Solr that if I

Re: Solr node goes down while trying to index records

2013-09-17 Thread Furkan KAMACI
Could you give some information about your jetty.xml and more info about your index rate and the RAM usage of your machines? On Tuesday, 17 September 2013, neoman harira...@gmail.com wrote: yes. the nodes go down while indexing. if we stop indexing, it does not go down.

Re: tlog after commit

2013-09-17 Thread Furkan KAMACI
Did you check here: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ On Tuesday, 17 September 2013, Alejandro Calbazana acalbaz...@gmail.com wrote: Quick question... Should I still see tlog files after a hard commit? I'm

Re: Problem indexing windows files

2013-09-17 Thread Furkan KAMACI
Firstly; This may not be a Solr related problem. Did you check the Solr log file? Tika may have problems in some situations. For example, when parsing HTML that has a base64 encoded image it may have some problems. If you find the correct logs you can detect it. On the other

Re: Some text not indexed in solr4.4

2013-09-17 Thread Furkan KAMACI
On the other hand did you check here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters what it says about MultiPhraseQuery? On Wednesday, 18 September 2013, Furkan KAMACI furkankam...@gmail.com wrote: Hi; Did you run commit command? On Wednesday, 18 September 2013

Re: Some text not indexed in solr4.4

2013-09-17 Thread Furkan KAMACI
Hi; Did you run commit command? On Wednesday, 18 September 2013, Utkarsh Sengar utkarsh2...@gmail.com wrote: To add to it, I see the exact problem with the queries: nikon d7100, nikon d5100, samsung ps-we450 etc. Thanks, -Utkarsh On Tue, Sep 17, 2013 at 2:20 PM,

Re: Solrcloud - adding a node as a replica?

2013-09-18 Thread Furkan KAMACI
Are you looking for this: http://lucene.472066.n3.nabble.com/SOLR-Cloud-Collection-Management-quesiotn-td4063305.html On Wednesday, 18 September 2013, didier deshommes dfdes...@gmail.com wrote: Hi, How do I add a node as a replica to a solrcloud cluster? Here is my

Re: Re: Unable to getting started with SOLR

2013-09-18 Thread Furkan KAMACI
I suggest you start from here: http://wiki.apache.org/solr/HowToCompileSolr On Sunday, 15 September 2013, Erick Erickson erickerick...@gmail.com wrote: If you're using the default jetty container, there's no log unless you set it up, the content is echoed to the

Re: solr performance against oracle

2013-09-18 Thread Furkan KAMACI
Pramod Sadalage and Martin Fowler have a nice book about this kind of architectural design: NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. If you read it you will see when to use NoSQL, an RDBMS, or both. On the other hand, I have over 50 million documents on replicated nodes of SolrCloud

Re: Solrcloud - adding a node as a replica?

2013-09-19 Thread Furkan KAMACI
Do not hesitate to ask questions if you have any problems about it. 2013/9/19 didier deshommes dfdes...@gmail.com Thanks Furkan, That's exactly what I was looking for. On Wed, Sep 18, 2013 at 4:21 PM, Furkan KAMACI furkankam...@gmail.com wrote: Are yoh looking for that: http

Re: I can't open the admin page, it's always loading.

2013-09-19 Thread Furkan KAMACI
Could you paste your Jetty logs from when you try to open the admin page? On Thursday, 19 September 2013, Micheal Chao fisher030...@hotmail.com wrote: Hi, I followed the tutoral to download solr4.4 and unzip it, and then i started jetty. i can post data and search correctly, but

Re: Problem with stopword

2013-09-19 Thread Furkan KAMACI
Firstly, you should read this: https://cwiki.apache.org/confluence/display/solr/Running+Your+Analyzer Secondly, when you write a query, stop words are filtered from your query if you use a stop word analyzer, so there will not be anything left to search. On Thursday, 19 September 2013, mpcmarcos

Re: Cause of NullPointer Exception? (Solr with Spring Data)

2013-09-21 Thread Furkan KAMACI
Your Solr server may not be working correctly. You should give us information from your Solr logs instead of Spring. Can you reach the Solr admin page? On Friday, 20 September 2013, JMill apprentice...@googlemail.com wrote: I am unsure about the cause of the following

Near Duplicate Document Detection at Solr

2013-09-22 Thread Furkan KAMACI
will be proud to contribute and adopt it into Solr. Thanks; Furkan KAMACI

Re: Near Duplicate Document Detection at Solr

2013-09-22 Thread Furkan KAMACI
at SolrCloud? What do you think? 2013/9/22 Furkan KAMACI furkankam...@gmail.com I want to detect near duplicate documents (for web documents). I know that there is an algorithm called Winnowing and there is another technique used by Google. However I also know that Solr has a component called

Re: deployee issu on solr

2013-09-23 Thread Furkan KAMACI
Could you send your error? 2013/9/23 Ramesh ramesh.po...@vensaiinc.com Unable to deploying solr 4.4 on JBoss -4.0.0 I am getting error like

Re: Problem running EmbeddedSolr (spring data)

2013-09-24 Thread Furkan KAMACI
Run the Maven dependency tree command and you can easily understand the cause of the dependency conflict. If not, you can send your command line output and we can help you. On Saturday, 21 September 2013, Erick Erickson erickerick...@gmail.com wrote: bq: Caused by:

Re: Xml file is not inserting from code java -jar post.jar *.xml

2013-09-26 Thread Furkan KAMACI
You should start to read from here: http://lucene.apache.org/solr/4_4_0/tutorial.html 2013/9/26 Kishan Parmar kishan@gmail.com http://www.coretechnologies.com/products/AlwaysUp/Apps/RunApacheSolrAsAService.html \ this is the link from where i fown the solr installation Regards,

Re: App server?

2013-10-03 Thread Furkan KAMACI
I've answered a question similar to yours before. Here are my thoughts: Of course you may have reasons to use Tomcat or something else (e.g. your staff may have more experience with Tomcat). However, developers generally run Jetty because it is the default for Solr, and I should point out that

Regex Search at Solr

2013-10-03 Thread Furkan KAMACI
I have two questions: *First one:* I have a url field in my index. I have some supported protocols, e.g. http and https. How can I list the urls in my index that are not supported urls? (Which query parser do you suggest for this kind of purpose?) http://www.google.com/sfdsd sfsdf

Re: massive memory consumption of grouping feature

2013-10-03 Thread Furkan KAMACI
Hi Alok; Please do not reply to an old message on the mailing list. Users may not see the question. Instead, start a new thread and give a link to the original one. 2013/10/3 Alok Bhandari alokomprakashbhand...@gmail.com Did find any solution to this. I am also facing the same issue. -- View
