Re: How can I learn the total count of how many documents indexed and how many documents updated?

2013-07-18 Thread Furkan KAMACI
/18 Shawn Heisey > On 7/17/2013 8:06 AM, Furkan KAMACI wrote: > > I have crawled some web pages and indexed them at my SolrCloud(Solr > 4.2.1). > > However before I index them there was already some indexes. I can > calculate > > the difference between current and previ

Re: How can I learn the total count of how many documents indexed and how many documents updated?

2013-07-18 Thread Furkan KAMACI
> > 41 > > 15000ms > > 37 > > 0 > > 2 > > 0 > > 0 > > 0 > > 0 > > 0 > > 0 > > 0 > > 211453 > > 0 > > 0 > > 0 > > > > > > I think that there is no information about what I

IDNA Support For Solr

2013-07-19 Thread Furkan KAMACI
Hi; Is there any support for IDNA in Solr? (IDNA: http://en.wikipedia.org/wiki/Internationalized_domain_name)

Re: IDNA Support For Solr

2013-07-19 Thread Furkan KAMACI
What I mean is this: there is a web address, *çorba.com*. However, its IDNA-encoded version is *xn--orba-zoa.com*. You can check it here: *http://www.whois.com.tr/?q=%C3%A7orba&sldtld=com* Let's assume that I've indexed a web page with that URL, *xn--orba-zoa.com*, and one sear
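One way to make both spellings of such a host name match is to normalize URLs to a single form before indexing and before querying. The following is a minimal sketch of that idea done outside Solr, using Python's built-in `idna` codec; the helper names `to_ascii` and `to_unicode` are made up for illustration and are not part of Solr.

```python
# Sketch: convert a host name between its Unicode form and its
# ASCII-compatible (xn--) form with Python's built-in "idna" codec.
# The function names are hypothetical helpers, not a Solr API.

def to_ascii(host: str) -> str:
    """Return the ASCII-compatible form of a host name."""
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Return the Unicode form of an ASCII-compatible host name."""
    return host.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("çorba.com"))
    print(to_unicode("xn--orba-zoa.com"))
```

Applying the same conversion to the indexed URL and to the user's query would let either spelling find the document, assuming the URL field is normalized consistently on both sides.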

Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Furkan KAMACI
Hi; Sometimes a large part of a document may also exist in another document, as in student plagiarism or a quotation of one blog post in another. Does Solr/Lucene or one of its libraries (UIMA, OpenNLP, etc.) have any class to detect this?
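To make the question concrete, here is a small sketch of one common approach to this kind of detection: word shingling with a containment score. This is a generic technique and not a Solr/Lucene class; the function names and the shingle size of 3 are assumptions chosen for illustration.

```python
# Sketch: estimate how much of one text is contained in another by
# comparing overlapping word sequences ("shingles").
# Generic technique for illustration, not a Solr/Lucene feature.

def shingles(text: str, size: int = 3) -> set:
    """Return the set of consecutive word tuples of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(part: str, whole: str, size: int = 3) -> float:
    """Fraction of the shingles of `part` that also occur in `whole`."""
    part_shingles = shingles(part, size)
    if not part_shingles:
        return 0.0
    return len(part_shingles & shingles(whole, size)) / len(part_shingles)


if __name__ == "__main__":
    quote = "the quick brown fox jumps"
    post = "yesterday the quick brown fox jumps over the lazy dog"
    print(containment(quote, post))
```

A score close to 1.0 suggests that most of the first text appears inside the second; a threshold for flagging a quotation would have to be tuned on real data.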

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Furkan KAMACI
Actually I need a specialized algorithm. I want to use that algorithm to detect duplicate blog posts. 2013/7/23 Tommaso Teofili > Hi, > > I you may leverage and / or improve MLT component [1]. > > HTH, > Tommaso > > [1] : http://wiki.apache.org/solr/MoreLikeThis >

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Furkan KAMACI
blogposts copies" text > 3. get the first, say, 2-3 hits > 4. mark it as quote / plagiarism > 5. eventually train a classifier to help you mark other texts as quote / > plagiarism > > HTH, > Tommaso > > > > 2013/7/23 Furkan KAMACI > > > Actua

WikipediaTokenizer for Removing Unnecessary Parts

2013-07-23 Thread Furkan KAMACI
Hi; I have indexed Wikipedia data with Solr DIH. However, when I look at the data indexed in Solr, I also see something like this: {| style="text-align: left; width: 50%; table-layout: fixed;" border="0" |- valign="top" | style="width: 50%"| :*[[Ubuntu]] :*[[Fedora]] :*[[Mandriva]] :*[[Linux Mint]]

Re: WikipediaTokenizer for Removing Unnecessary Parts

2013-07-23 Thread Furkan KAMACI
dmin UI analysis page? > > You should just see the text tokens plus the URLs for links. > > -- Jack Krupansky > > -Original Message- From: Furkan KAMACI > Sent: Tuesday, July 23, 2013 10:53 AM > To: solr-user@lucene.apache.org > Subject: WikipediaTokenizer for R

Usage Of Real Time Get Handler Of Solr

2013-07-24 Thread Furkan KAMACI
Hi; There is a real time get handler at Solr: true json true ${solr.ulog.dir:}

How to Read Solr Admin Stats Page?

2013-07-24 Thread Furkan KAMACI
I am indexing and checking the admin stats page. I see this: commits:471 autocommit maxTime:15000ms autocommits:414 soft autocommits:0 optimizes:12 docsPending:388 adds:305 cumulative_adds:2154245

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-25 Thread Furkan KAMACI
> would do the rest for you :) > > roman > > > On Tue, Jul 23, 2013 at 11:07 AM, Shashi Kant wrote: > > > Here is a paper that I found useful: > > http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf > > > > > > On Tue, Jul

Difference between qf and pf parameters

2013-07-26 Thread Furkan KAMACI
Here is an example from the example solrconfig file: content^0.5 anchor^1.0 title^1.2 content^0.5 anchor^1.5 title^1.2 site^1.5 What is the difference between the qf and pf parameters? They both boost fields, but there should be a difference.

Highlight Problem

2013-07-26 Thread Furkan KAMACI
This was at example solrconfig file: dismax explicit 0.01 content^0.5 anchor^1.0 title^1.2 content^0.5 anchor^1.5 title^1.2 site^1.5 url 100 true *:* title url content 0 title 0 url regex

Re: Highlight Problem

2013-07-26 Thread Furkan KAMACI
Ok, I've found that there was no problem with the config. 2013/7/26 Furkan KAMACI > This was at example solrconfig file: > > > > dismax > explicit > 0.01 > content^0.5 anchor^1.0 title^1.2 > content^0.5 anchor^1.5 title^1.2

Exact Match

2013-07-26 Thread Furkan KAMACI
When I run this query: solr/select?q=url:"ftp://"&wt=xml&fl=url I get results like these: http://forum.whmdestek.com/ftp-makaleleri/ http://www.netadi.com/ftp-kurulumu.php Why does it not do an exact search for *ftp://*?

Synonym Phrase

2013-07-26 Thread Furkan KAMACI
I have a synonyms file like this: cart; shopping cart; market trolley When I analyse my query, I see that when I search for cart these become synonyms: cart, shopping, market, trolley so cart is treated as a synonym of shopping. How should I define my synonyms.txt file so that it will understand that cart is

How to Make That Domains Should Be First?

2013-07-26 Thread Furkan KAMACI
When I search for wikipedia, the home page of Wikipedia is not the first result: http://www.wikipedia.org/ The first result is: http://en.wikipedia.org/wiki/Spain How can I say that the domains of web sites should come first in SolrCloud? (I want something like grouping by domain and boosting by url leng

Re: Synonym Phrase

2013-07-26 Thread Furkan KAMACI
n preprocessing could be as simple as scanning for the synonym > phrases and then adding "OR" terms for the synonym phrases. > > -- Jack Krupansky > > -Original Message- From: Furkan KAMACI > Sent: Friday, July 26, 2013 10:53 AM > To: solr-user@lucene.apache.org
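The preprocessing idea mentioned in this reply (scan the query for known synonym phrases and add "OR" terms for them) could look roughly like the sketch below. It runs in the client before the query is sent; the function name `expand_query` and the in-memory synonym groups are assumptions for illustration, not a Solr API.

```python
# Sketch: client-side query preprocessing that replaces the first known
# synonym phrase in a query with an OR group of quoted alternatives.
# expand_query and the synonym groups are hypothetical, not part of Solr.
import re


def expand_query(query: str, synonym_groups: list) -> str:
    for group in synonym_groups:
        # Try longer phrases first so "shopping cart" wins over "cart".
        for phrase in sorted(group, key=len, reverse=True):
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, query, flags=re.IGNORECASE):
                alternatives = " OR ".join('"%s"' % p for p in group)
                replacement = "(" + alternatives + ")"
                # A function replacement avoids backslash processing.
                return re.sub(pattern, lambda m: replacement, query,
                              count=1, flags=re.IGNORECASE)
    return query


if __name__ == "__main__":
    groups = [["cart", "shopping cart", "market trolley"]]
    print(expand_query("cheap shopping cart", groups))
```

Quoting each alternative keeps multi-word synonyms together as phrases, which is the part the whitespace-split synonym handling described in the thread does not do.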

Re: Synonym Phrase

2013-07-26 Thread Furkan KAMACI
Should I rewrite it like this: shopping cart => market trolley, cart or something like that? 2013/7/26 Furkan KAMACI > Why Solr does not split that terms by*;* I think that it both > split by *;* and white space character? > > > 2013/7/26 Jack Krupansky >

Exact Search Problem

2013-07-26 Thread Furkan KAMACI
Let's assume that I have these URLs in my index: www.abc.com www.abc.com/a www.abc.com/b www.abc.com/c ... www.abc.com/x How can I do an exact search for www.abc.com? url:"www.abc.com" doesn't work because it also returns www.abc.com/a, www.abc.com/b, etc.

Query Performance

2013-07-28 Thread Furkan KAMACI
What is the difference between: q=*:*&rows=row_count&sort=id asc and q={X TO *}&rows=row_count&sort=id asc Does the first one try to get all the documents but cut the result, or are they the same, or...? What happens in the underlying process of Solr for these two queries?

Re: Query Performance

2013-07-28 Thread Furkan KAMACI
f documents, relative to whatever > row_count might be. > > Generally, you should only sort a small number of documents/results. > > Or, consider DocValues since they are designed for sorting. > > -- Jack Krupansky > > -Original Message- From: Furkan KAMACI

RAM Usage Debugging

2013-07-29 Thread Furkan KAMACI
When I look at my dashboard I see that 27.30 GB is available for the JVM, 24.77 GB is gray and 16.50 GB is black. I am not doing anything on my machine right now. Did it cache documents or is there a problem? How can I find out?

AND Queries

2013-07-29 Thread Furkan KAMACI
I am searching for keywords like this: lang:en AND url:book pencil cat It returns results, however none of them include all of the book, pencil and cat keywords. How should I rewrite my query? I tried this: lang:en AND url:(book AND pencil AND cat) and it looks OK. However this does not: lang:

Re: AND Queries

2013-07-29 Thread Furkan KAMACI
any field, it's the default field > which is used. > > So, > lang:en AND url:book AND pencil AND cat > > is interpreted as : > lang:en AND url:book AND :pencil AND :cat > > > The default search field is defined in your schema.xml file > (defaultSearchField) >
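Since terms without an explicit field fall back to the default search field, a client can build the query so that every term is grouped under the intended field. A minimal sketch follows; the helper name `fielded_and` is hypothetical, and it assumes the terms contain no characters that need escaping.

```python
# Sketch: build a query string in which all terms are ANDed inside one
# explicit field, so none of them falls back to the default search field.
# fielded_and is a hypothetical helper; terms are assumed to be plain words.

def fielded_and(field: str, terms: list) -> str:
    return "%s:(%s)" % (field, " AND ".join(terms))


if __name__ == "__main__":
    query = "lang:en AND " + fielded_and("url", ["book", "pencil", "cat"])
    print(query)
```

The result has the same shape as the grouped form of the query reported as working earlier in this thread.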

How to run a verification process at pre-commit documents and then commit them into live indexes if they are valid?

2013-08-02 Thread Furkan KAMACI
I use Solr 4.2.1 as SolrCloud. My live indexes will be searched by a huge number of users and I don't want anything to go wrong. I have some criteria for my indexes, i.e. there mustn't be spam documents in my index (I have a spam detector tool), some documents should be on the first result page (or with

Re: How to run a verification process at pre-commit documents and then commit them into live indexes if they are valid?

2013-08-02 Thread Furkan KAMACI
However, if the verification procedure is automated, can't you just submit > hundred of deleteQueries with all undesired spam words before commit? > > > On Fri, Aug 2, 2013 at 12:48 PM, Furkan KAMACI >wrote: > > > I use Solr 4.2.1 as SolrCloud. My live indexes will be

Top Ten Terms

2013-08-05 Thread Furkan KAMACI
Hi; When I click Schema Browser on the Admin Page and load term info for a field, I get the top ten terms. When I click the question mark next to it, it redirects me to the Solr query page. The query on that page is: http://localhost:8983/solr/#/collection1/query?q=content:[* TO *] What is the relation between that que

Re: Top Ten Terms

2013-08-05 Thread Furkan KAMACI
king at. > > HTH > Stefan > > > > On Monday, August 5, 2013 at 10:48 AM, Furkan KAMACI wrote: > > > Hi; > > > > When I click Schema Browser at Admin Page and load term info for a field > I > > get top ten terms. When I click question mark near it, i

Getter API for SolrCloud

2013-08-15 Thread Furkan KAMACI
I've implemented an application that connects my UI and SolrCloud. I want to write code that makes a search request to SolrCloud and sends the result to my UI. I know that there are some examples of this, but I want a fast and really good way to do it. One way I did: ModifiableSolrParams params =

Re: Getter API for SolrCloud

2013-08-15 Thread Furkan KAMACI
Here is a conversation about it: http://lucene.472066.n3.nabble.com/SolrCloud-with-Zookeeper-ensemble-in-production-environment-SEVERE-problems-td4047089.html However the result of conversation is not clear. Any ideas? 2013/8/15 Furkan KAMACI > I've implemented an application that con

Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Furkan KAMACI
Hi; I want to write an analyzer that will prevent some special words. For example, the sentence to be indexed is: diet follower It will tokenize it like this: token 1) diet token 2) follower token 3) diet follower How can I do that with Solr?

Re: Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Furkan KAMACI
prevent" any keywords. > > You need to elaborate the specific requirements with more detail. > > Given a long stream of text, what tokenization do you expect in the index? > > -- Jack Krupansky > > -Original Message- From: Furkan KAMACI Sent: Monday, August 19, >

Re: Prevent Some Keywords at Analyzer Step

2013-08-21 Thread Furkan KAMACI
or term. > >> > >> So, I'm still baffled as to what you are really trying to do. Trying > >> explaining it in plain English. > >> > >> And given this same input, how would it be queried? > >> > >> > >> -- Jack Krupansky >

Re: Solr Indexing Status

2013-08-21 Thread Furkan KAMACI
You know the size of CSV files and you can calculate it if you want. 2013/8/21 Prasi S > Hi, > I am using solr 4.4 to index csv files. I am using solrj for this. At > frequent intervels my user may request for "Status". I have to send get > something like in DIH " Indexing in progress.. Added x

How to Manage RAM Usage at Heavy Indexing

2013-08-24 Thread Furkan KAMACI
I ran a test on my SolrCloud. I tried to send 100 million documents via Hadoop to my node, which has no replica. When the number of documents sent to that node is around 30 million, RAM usage of my machine becomes 99% (Solr heap usage is not 99%, it uses just 3GB - 4GB of RAM). Some time later my node

Re: How to Manage RAM Usage at Heavy Indexing

2013-08-25 Thread Furkan KAMACI
t doesn't work, how can > I fix it?" without providing much in the way of details to help us help > you. > > Best > Erick > > > > On Sat, Aug 24, 2013 at 1:52 PM, Furkan KAMACI >wrote: > > > I make a test at my SolrCloud. I try to send 100 millions

Dropping Caches of Machine That Solr Runs At

2013-08-25 Thread Furkan KAMACI
Sometimes the physical memory usage of Solr is over 99% and this may cause problems. Do you run this kind of command periodically: sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches" to force dropping the caches of the machine that Solr runs on and avoid problems?

Re: Dropping Caches of Machine That Solr Runs At

2013-08-25 Thread Furkan KAMACI
s not drop caches after a time later and why recovery resulted with such a high Physical Memory usage.) 2013/8/25 Furkan KAMACI > Sometimes Physical Memory usage of Solr is over %99 and this may cause > problems. Do you run such kind of a command periodically: > > sudo sh -c "sy

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
Hi Walter; You are right about performance. However when I index documents on a machine that has a high percentage of Physical Memory usage I get EOF errors? 2013/8/26 Walter Underwood > On Aug 25, 2013, at 1:41 PM, Furkan KAMACI wrote: > > > Sometimes Physical Memory usage of

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
> What is the precise error? What kind of machine? > > File buffers are a robust part of the OS. Unix has had file buffer caching > for decades. > > wunder > > On Aug 26, 2013, at 1:37 AM, Furkan KAMACI wrote: > > > Hi Walter; > > > > You are right about p

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
> wunder > > On Aug 26, 2013, at 9:17 AM, Furkan KAMACI wrote: > > > It has a 48 GB of RAM and index size is nearly 100 GB at each node. I > have > > CentOS 6.4. While indexing I got that error and I am suspicious about > that > > it is because of high percentage of P

Re: Dropping Caches of Machine That Solr Runs At

2013-08-26 Thread Furkan KAMACI
ave several cores, totaling about 14GB on disk. This configuration allows 100% of the indexes to be in file buffers. > > wunder > > On Aug 26, 2013, at 9:57 AM, Furkan KAMACI wrote: > >> Hi Walter; >> >> You said you are caching your documents. What is average Physica

Re: Unknown attribute id in add:allowDups

2013-09-07 Thread Furkan KAMACI
I did not use the PECL package, and the problem may be related to that. I want to ask: when you define your schema you indicate *required="true"*. However, the error says *allowDups* for the id field. So it seems that id is not a unique field for that package. You may need to configure something else at

Re: Connection Established but waiting for response for a long time.

2013-09-07 Thread Furkan KAMACI
Could you give us more information about your other Jetty configurations? 2013/9/6 qungg > Hi, > > I'm runing solr 4.0 but using legacy distributed search set up. I set the > shards parameter for search, but indexing into each solr shards directly. > The problem I have been experiencing is buil

Re: Indexing pdf files - question.

2013-09-07 Thread Furkan KAMACI
Could you show us logs you get when you start your web container? 2013/9/4 Nutan Shinde > My solrconfig.xml is: > > > > class="solr.extraction.ExtractingRequestHandler" > > > > > descwhich > is defined as shown below in schem.xml--> > > true > > attr_ > > true > > > > > > > > > > Schem

Re: Adding weight to location of the string found

2013-09-07 Thread Furkan KAMACI
Firstly, did you check here: http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/search/package-summary.html#package_description 2013/8/28 zseml > In Solr syntax, is there a way to add weight to the result found based on > the > location of the string that it's found? > > For instance, i

Re: Regarding reducing qtime

2013-09-07 Thread Furkan KAMACI
What is your question here? 2013/9/6 prabu palanisamy > Hi > > I am currently using solr -3.5.0 indexed by wikipedia dump (50 gb) with > java 1.6. I am searching the tweets in the solr. Currently it takes average > of 210 millisecond for each post, out of which 200 millisecond is consumed > by

Re: Can we used CloudSolrServer for searching data

2013-09-07 Thread Furkan KAMACI
Shalin is right. If you read the documentation for CloudSolrServer you will see that: *SolrJ client class to communicate with SolrCloud. Instances of this class communicate with Zookeeper to discover Solr endpoints for SolrCloud collections, and then use the LBHttpSolrServer to issue requests.* *

Re: SolrCloud - shard containing an invalid host:port

2013-09-07 Thread Furkan KAMACI
If that line (192.168.1.10:8983/solr) is not green and gray, then it is probably because you started up a Solr instance without defining a port and it has registered itself in Zookeeper. 2013/9/3 Daniel Collins > Was it a test instance that you created 8983 is the default port, so > possibly

Re: Tweaking boosts for more search results variety

2013-09-07 Thread Furkan KAMACI
What do you mean by "*these limitations*"? Do you want to do multiple groupings at the same time? 2013/9/6 Sai Gadde > Thank you Jack for the suggestion. > > We can try group by site. But considering that number of sites are only > about 1000 against the index size of 5 million, One can expect mo

Re: How to Manage RAM Usage at Heavy Indexing

2013-09-09 Thread Furkan KAMACI
uned in /etc/sysctl.conf > > > On Sun, Aug 25, 2013 at 4:23 PM, Furkan KAMACI >wrote: > > > Hi Erick; > > > > I wanted to get a quick answer that's why I asked my question as that > way. > > > > Error is as follows: > > > > INFO - 201

Re: using tika inside SOLR vs using nutch

2013-09-10 Thread Furkan KAMACI
If you have tens of millions of documents to parse and want to do that job inside Solr, it means that you will put a workload on Solr. If there are many queries to your Solr node, then you should consider that CPU and RAM may not be enough while it is both parsing and somebody is queryin

Re: Solr PingQuery

2013-09-16 Thread Furkan KAMACI
I want to add one more thing for Shawn about Zookeeper. In order to have a quorum, you need to have half the servers plus one available. Because of that, let's assume you have 4 Zookeeper machines, two of them communicating with each other and the other two communicating with each other. Assume that
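The arithmetic behind the 4-machine example can be sketched as follows, reading "half the servers plus one" as a strict majority of the ensemble. The function names are made up for illustration and are not a ZooKeeper API.

```python
# Sketch: quorum arithmetic for an ensemble, assuming "half the servers
# plus one" means a strict majority. Hypothetical helpers, not a
# ZooKeeper API.

def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """True if a group of `reachable` servers can form a quorum."""
    return reachable >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # A 4-node ensemble split into two groups of 2: neither side qualifies.
    print(quorum_size(4), has_quorum(2, 4))
    # A 3-node ensemble tolerates the same single failure as a 4-node one.
    print(quorum_size(3), has_quorum(2, 3))
```

With 4 machines split two and two, neither half reaches the majority of 3, which is why an even ensemble size adds no fault tolerance over the next smaller odd size.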

Re: Slow query at first time

2013-09-16 Thread Furkan KAMACI
What is the query time of your search? I mean something like this: QueryResponse solrResponse = query(solrParams); solrResponse.getQTime(); 2013/9/16 Sergio Stateri > Hi, > > I'm trying to make a search with Solr 4.4, but in the first time the search > is too slow. I have studied about pre-warm queries,

Re: Solr node goes down while trying to index records

2013-09-17 Thread Furkan KAMACI
Do you get that error only when indexing? 2013/9/17 neoman > Hello everyone, > one or more of the nodes in the solrcloud go down randomly when we try to > index data using solrj APIs. The nodes do recover. but when we try to index > back, they go down again > > Our configuration: > 3 shards > S

Re: Stop zookeeper from batch

2013-09-17 Thread Furkan KAMACI
Are you looking for this: https://issues.apache.org/jira/browse/ZOOKEEPER-1122 On Monday, 16 September 2013, Prasi S wrote: > Hi, > We have setup solrcloud with zookeeper and 2 tomcats . we are using a batch > file to start the zookeeper, uplink config files and start to

Limits of Document Size at SolrCloud and Faced Problems with Large Size of Documents

2013-09-17 Thread Furkan KAMACI
Currently I have over 50 million documents in my index and, as I mentioned before in another question, I have some problems while indexing (Jetty EOF exception). I know that the problem may not be about index size, but I want to learn whether there is any limit for document size in Solr that if I exceed

Re: Solr node goes down while trying to index records

2013-09-17 Thread Furkan KAMACI
Could you give some information about your jetty.xml and give more info about your index rate and RAM usage of your machines? On Tuesday, 17 September 2013, neoman wrote: > yes. the nodes go down while indexing. if we stop indexing, it does not go > down. > > > > -- > View this

Re: tlog after commit

2013-09-17 Thread Furkan KAMACI
Did you check here: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ On Tuesday, 17 September 2013, Alejandro Calbazana wrote: > Quick question... Should I still see tlog files after a hard commit? > > I'm trying to test soft

Re: Problem indexing windows files

2013-09-17 Thread Furkan KAMACI
Firstly; This may not be a Solr related problem. Did you check the log file of Solr? Tika may have problems in some situations. For example, when parsing HTML that has a base64 encoded image it may have some problems. If you find the correct logs you can detect it. On the other tak

Re: Some text not indexed in solr4.4

2013-09-17 Thread Furkan KAMACI
On the other hand, did you check what http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters says about MultiPhraseQuery? On Wednesday, 18 September 2013, Furkan KAMACI wrote: > Hi; > > Did you run commit command? > > On Wednesday, 18 September 2013

Re: Some text not indexed in solr4.4

2013-09-17 Thread Furkan KAMACI
Hi; Did you run the commit command? On Wednesday, 18 September 2013, Utkarsh Sengar wrote: > To add to it, I see the exact problem with the queries: "nikon d7100", > "nikon d5100", "samsung ps-we450" etc. > > Thanks, > -Utkarsh > > > On Tue, Sep 17, 2013 at 2:20 PM, Utkarsh Seng

Re: Solrcloud - adding a node as a replica?

2013-09-18 Thread Furkan KAMACI
Are you looking for this: http://lucene.472066.n3.nabble.com/SOLR-Cloud-Collection-Management-quesiotn-td4063305.html On Wednesday, 18 September 2013, didier deshommes wrote: > Hi, > How do I add a node as a replica to a solrcloud cluster? Here is my > situation: some time ag

Re: Re: Unable to getting started with SOLR

2013-09-18 Thread Furkan KAMACI
I suggest you start here: http://wiki.apache.org/solr/HowToCompileSolr On Sunday, 15 September 2013, Erick Erickson wrote: > If you're using the default jetty container, there's no log unless > you set it up, the content is echoed to the screen. > > About a zillio

Re: solr performance against oracle

2013-09-18 Thread Furkan KAMACI
Martin Fowler and Sadalage have a nice book about this kind of architectural design: NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. If you read it you will see why to use a NoSQL store, an RDBMS, or both of them. On the other hand I have over 50 million documents on replicated nodes of SolrCloud a

Re: Solrcloud - adding a node as a replica?

2013-09-19 Thread Furkan KAMACI
Do not hesitate to ask questions if you have any problems about it. 2013/9/19 didier deshommes > Thanks Furkan, > That's exactly what I was looking for. > > > On Wed, Sep 18, 2013 at 4:21 PM, Furkan KAMACI >wrote: > > > Are yoh looking for that: > > &g

Re: I can't open the admin page, it's always loading.

2013-09-19 Thread Furkan KAMACI
Could you paste your Jetty logs from when you try to open the admin page? On Thursday, 19 September 2013, Micheal Chao wrote: > Hi, I followed the tutoral to download solr4.4 and unzip it, and then i > started jetty. i can post data and search correctly, but when i try to open > a

Re: Problem with stopword

2013-09-19 Thread Furkan KAMACI
Firstly, you should read here: https://cwiki.apache.org/confluence/display/solr/Running+Your+Analyzer Secondly, when you write a query, stop words are filtered from your query if you use a stop word analyzer, so there will not be anything left to search. On Thursday, 19 September 2013, mpcmarcos

Re: Cause of NullPointer Exception? (Solr with Spring Data)

2013-09-21 Thread Furkan KAMACI
Your Solr server may not be working correctly. You should give us information from your Solr logs instead of Spring. Can you reach the Solr admin page? On Friday, 20 September 2013, JMill wrote: > I am unsure about the cause of the following NullPointer Exception. Any > Ideas?

Near Duplicate Document Detection at Solr

2013-09-22 Thread Furkan KAMACI
ll be proud to contribute and adopt it into Solr. Thanks; Furkan KAMACI

Re: Near Duplicate Document Detection at Solr

2013-09-22 Thread Furkan KAMACI
ation at SolrCloud? What do you think? 2013/9/22 Furkan KAMACI > I want to detect near duplicate documents (for web documents). I know that > there is an algorithm called Winnowing and there is another technique used > by Google. However I also know that Solr has a component called

Re: deployee issu on solr

2013-09-23 Thread Furkan KAMACI
Could you send your error? 2013/9/23 Ramesh > Unable to deploying solr 4.4 on JBoss -4.0.0 I am getting error like > > > >

Re: Problem running EmbeddedSolr (spring data)

2013-09-24 Thread Furkan KAMACI
Run the Maven dependency tree command and you can easily understand the cause of the dependency conflict. If not, you can send your command line output and we can help you. On Saturday, 21 September 2013, Erick Erickson wrote: > bq: Caused by: java.lang.NoSuchMethodError: > > This us

Re: Xml file is not inserting from code java -jar post.jar *.xml

2013-09-26 Thread Furkan KAMACI
You should start to read from here: http://lucene.apache.org/solr/4_4_0/tutorial.html 2013/9/26 Kishan Parmar > > http://www.coretechnologies.com/products/AlwaysUp/Apps/RunApacheSolrAsAService.html > \ > > this is the link from where i fown the solr installation > > Regards, > > Kishan Parmar >

Re: App server?

2013-10-02 Thread Furkan KAMACI
I've answered a question similar to yours before. Here are my thoughts: Of course you may have some reasons to use Tomcat or anything else (i.e. your staff may have more experience with Tomcat etc.). However, developers generally run Jetty because it is the default for Solr and I should point that Sol

Regex Search at Solr

2013-10-03 Thread Furkan KAMACI
I have two questions: *First one:* I have a url field in my index. I have some supported protocols, i.e. http and https. How can I list the urls in my index that are not supported urls? (Which query parser do you suggest for this kind of purpose?) http://www.google.com/sfdsd sfsdf sfdsf/

Re: massive memory consumption of grouping feature

2013-10-03 Thread Furkan KAMACI
Hi Alok; Please do not reply an old message at mail list. Users may not see the question. Instead of that start a new thread and give a link to original one. 2013/10/3 Alok Bhandari > Did find any solution to this. I am also facing the same issue. > > > > -- > View this message in context: > h

Re: Regex Search at Solr

2013-10-03 Thread Furkan KAMACI
I will go with that. I just wondered whether I could do it with the existing test data or not. 2013/10/4 Otis Gospodnetic > Maybe storing the protocol info in a separate field would be cleaner. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > On O

Re: WikipediaTokenizer documentation

2013-10-04 Thread Furkan KAMACI
I suggest you look here: http://www.javadocexamples.com/java_source/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerTest.java.html 2013/10/4 Ken Krugler > Hi all, > > Where's the documentation on the WikipediaTokenizer? > > Specifically I'm wondering how pieces from the source XML

Re: FileNotFoundException

2013-10-04 Thread Furkan KAMACI
Did you check here at logs: *Caused by: java.io.FileNotFoundException: /opt/solr/myCore/data/index/_2he9.si (No such file or directory)* 2013/10/4 tamanjit.bin...@yahoo.co.in > Hi, > We migrated to Solr 4.3 from 3.5 yesterday. We use multicore Master Slave > architecture and use external scrip

Re: Solr client in CPP

2013-10-04 Thread Furkan KAMACI
There is an old question like that: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201101.mbox/%3CAANLkTi=itRz7ni6HV-m=GTThzLb9G8XkWi92jBn=p...@mail.gmail.com%3E Also you check that page for general information: http://websolr.com/guides/solr-clients 2013/10/4 Neeraj Pandey > Hi a

Re: Can I pass some Object as request parameter to solr server

2013-10-04 Thread Furkan KAMACI
I've implemented a SearchRequest class at my application. It has some custom fields and filled via web services automatically from a JSON object (via jackson). Inside a core class I retrieve proper attributes from that class and send a query to Solr server. If you have an API class to reach your S

Re: UML diagrams for solr

2013-10-05 Thread Furkan KAMACI
tanding concepts are much more better way to learn it but it's up to you. Thanks; Furkan KAMACI 2013/10/5 Kishan Parmar > I Also neet this plz send us some uml or er-diagram for solr ..,.,.. > or any documentation foe solr > > > Regards, > > Kishan Parmar > Software Developer > +91 95 100 77394 > Jay Shree Krishnaa !! >

Re: Error with Solr 4.4.0, Glassfish, and CentOS 6.2

2013-10-05 Thread Furkan KAMACI
Could you explain what the error is? 2013/10/5 paul brickell > How do I 'get rid of it entirely'? >

Re: How to warm up filter queries for a category field with 1000 possible values ?

2013-10-06 Thread Furkan KAMACI
If you are asking to read from a file for warm up and if there is not a capability for what you want I can open a Jira issue and send a patch. 2013/10/7 user 01 > what's the way to warm up filter queries for a category field with 1000 > possible values. Would I need to write 1000 lines manually

Difference Between Query Time and Elapsed Time at Solrj Query Response

2013-10-07 Thread Furkan KAMACI
The QueryResponse object in SolrJ has two different methods for the time required for a given query. One of them is for *QTime (queryTime)* and the other one is for *elapsedTime*. What are the differences between them, and what exactly is elapsedTime for?

Re: [SolrJ] HttpSolrServer - maxRetries

2013-10-07 Thread Furkan KAMACI
Hi Bram; Could you send your error logs? 2013/10/7 Bram Van Dam > Hi folks, > > Long story short: I'm occasionally getting exceptions under heavy load > (SocketException: Connection reset). I would expect HttpSolrServer to try > again maxRetries-times, but it doesn't. > > For reasons I don't en

Re: [SolrJ] HttpSolrServer - maxRetries

2013-10-07 Thread Furkan KAMACI
One more thing, could you tell us which version of Solr you are using? 2013/10/7 Bram Van Dam > On 10/07/2013 11:51 AM, Furkan KAMACI wrote: > >> Could you send you error logs? >> > > Whoops, forgot to paste: > > > Caused by: org.apache.solr.client.solrj.**S

What is the full list of Solr Special Characters?

2013-10-08 Thread Furkan KAMACI
I found this list: + - && || ! ( ) { } [ ] ^ " ~ * ? : \ at this URL: http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Escaping+Special+Characters I'm using Solr 4.5. Is there a full list of special characters to escape inside my custom search API before making a request to SolrCloud?
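A client-side escaping step based on the list quoted in this message might look like the sketch below. The character set is taken from that Lucene 2.9.4 list only; newer Lucene/Solr versions may treat additional characters (for example `/`) as special, so the set should be checked against the query parser documentation for the version in use. The function name `escape_query` is hypothetical.

```python
# Sketch: put a backslash before every character from the quoted list of
# query syntax characters. The set mirrors the Lucene 2.9.4 list in the
# message above and is an assumption for newer versions, which may need
# more characters. Each '&' and '|' is escaped individually, which is a
# conservative choice.

SPECIAL_CHARS = set('\\+-&|!(){}[]^"~*?:')


def escape_query(text: str) -> str:
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


if __name__ == "__main__":
    print(escape_query('title:(solr AND "cloud")'))
```

Escaping only makes user input safe as literal terms; it does not restrict which fields or how many rows a request may ask for, so those parameters would still need to be fixed on the server side.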

Re: What is the full list of Solr Special Characters?

2013-10-08 Thread Furkan KAMACI
Actually I want to remove special characters and won't send them to my Solr indexes. I mean that a user can send a special query, like an SQL injection, and I want to protect my system from this kind of scenario. 2013/10/8 Furkan KAMACI > I found that: > > + - && || ! ( ) { } [ ] ^ &

Effect of multiple white space at WhiteSpaceTokenizer

2013-10-08 Thread Furkan KAMACI
I use Solr 4.5 and I have a WhiteSpaceTokenizer in my schema. What is the difference (in index size and performance) between these two sentences: First one: This is a sentence. Second one: This is a sentence.
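The effect in question can be imitated outside Solr with a plain split on runs of whitespace. This is only an analogy written in Python, under the assumption that the tokenizer emits no empty tokens for consecutive spaces; the input with extra spaces is a made-up example, since the spacing of the original second sentence is not preserved in this listing.

```python
# Sketch: split on runs of whitespace and compare the tokens produced
# for single-spaced and multi-spaced input. Python analogy only; it
# assumes the Solr tokenizer likewise produces no empty tokens.

def whitespace_tokens(text: str) -> list:
    return text.split()


if __name__ == "__main__":
    single = whitespace_tokens("This is a sentence.")
    multiple = whitespace_tokens("This   is  a    sentence.")
    print(single == multiple)
```

If the tokens are identical, the indexed terms are identical too, although character offsets (when stored) would still differ between the two inputs.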

Re: SolrJ best pratices

2013-10-09 Thread Furkan KAMACI
I suggest you look here: http://wiki.apache.org/solr/Solrj?action=fullsearch&context=180&value=cloudsolrserver&titlesearch=Titles#Using_with_SolrCloud 2013/10/9 Shawn Heisey > On 10/7/2013 3:08 PM, Mark wrote: > >> Some specific questions: >> - When working with HttpSolrServer should we k

Re: SolrCloud High Availability during indexing operation

2013-10-09 Thread Furkan KAMACI
Hi Saurabh, Your link does not work (it is broken). 2013/10/9 Saurabh Saxena > Pastbin link http://pastebin.com/cnkXhz7A > > I am doing a bulk request. I am uploading 100 files, each file having 100 > docs. > > -Saurabh > > > On Tue, Oct 8, 2013 at 7:39 PM, Mark Miller wrote: > > > The attachm

Re: synonyms and term position

2013-10-09 Thread Furkan KAMACI
Could you send a screenshot of the admin Analysis page when trying to analyze those words? 2013/10/9 Alvaro Cabrerizo > Hi: > > I'm involved in a process o upgrade solr from 1.4 to 4.4 and I'm having a > problem using SynonymFilterFactory within the process chain > SynonymFilterFactory, StopFilterFac

Re: Multiple schemas in the same SolrCloud ?

2013-10-09 Thread Furkan KAMACI
You can have more information from here: https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage+Configuration+Files 2013/10/9 xinwu > I remember I must put the > "-Dbootstrap_confdir=/opt/Solr_home/collection1/conf > -Dcollection.configName=solrConfig " in the catalina.sh .

Re: synonyms and term position

2013-10-09 Thread Furkan KAMACI
Does "two" have "in" and "one" as synonyms? 2013/10/9 Furkan KAMACI > Does "two" have "in" and "one" as synonyms? > > > 2013/10/9 Alvaro Cabrerizo > >> Sure, >> >> Find attached the screen

Re: Searching on (hyphenated/capitalized) word issue

2013-10-09 Thread Furkan KAMACI
If you have the word "multicad" to index and you want to get a result when you search for "multi", you can use an ngram filter. However you should consider the pros and cons of using an ngram filter. If you use ngrams you may find "multicad" from "multi", but your index size will be much bigger. I s
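The index-size trade-off mentioned here can be illustrated by generating the n-grams of a single word by hand. This is a generic sketch of n-gram generation, not Solr's filter implementation; the function name and the 2 to 5 size range are assumptions for the example.

```python
# Sketch: generate all character n-grams of a word for a range of sizes,
# to show how one indexed word turns into many terms.
# Generic illustration, not Solr's ngram filter code.

def ngrams(word: str, min_size: int, max_size: int) -> list:
    grams = []
    for size in range(min_size, max_size + 1):
        for start in range(len(word) - size + 1):
            grams.append(word[start:start + size])
    return grams


if __name__ == "__main__":
    grams = ngrams("multicad", 2, 5)
    print("multi" in grams, len(grams))
```

The single word "multicad" becomes 22 terms for sizes 2 to 5, which is why matching "multi" this way costs noticeably more index space.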

Re: Find documents that are composed of % words

2013-10-09 Thread Furkan KAMACI
Are you asking about something like this: http://wiki.apache.org/solr/TextProfileSignature On Wednesday, 9 October 2013, shahzad73 wrote: > Please help me formulate the query that will be easy or do i have to build a > custom filter for this ? > > Shahzad > > > > -- > View this m
