>>>>>>>>>>>>>>>>>>>>>>>>>>
long fileBytes = file.length();
RepositoryDocument data = new RepositoryDocument();
data.setBinary(is,fileBytes);
String fileName = file.getName();
data.setFileName(fileName);
data.setMimeType(mapExtensionToMimeType(fileName));
<<<<<<<<<<<<<<<<<<<<<<<<<<<

Do I just need to comment out the 3rd line, i.e. data.setBinary(is,fileBytes); ?

Thanks,
Ameya

On Thu, Jul 31, 2014 at 4:17 PM, Ameya Aware <ameya.aw...@gmail.com> wrote:

> I could not exactly locate the position where this is happening.
>
> Can you please help me out with the changes?
>
> Thanks,
> Ameya
>
> On Thu, Jul 31, 2014 at 4:10 PM, Karl Wright <daddy...@gmail.com> wrote:
>
>> Hi Ameya,
>>
>> Since you are already modifying the connector for your purposes, nothing
>> is stopping you from modifying it further to not fetch the document and
>> instead substitute an empty input stream.
>>
>> Karl
>>
>> On Thu, Jul 31, 2014 at 3:03 PM, Ameya Aware <ameya.aw...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have modified the code a little to add different metadata fields, such
>>> as the ones below (FileConnector.java):
>>>
>>> data.addField("created", new Date((attr.creationTime().toMillis())));
>>> data.addField("last_accessed", new Date(attr.lastAccessTime().toMillis()));
>>> data.addField("last_modified", new Date(file.lastModified()));
>>> data.addField("size", file.length());
>>>
>>> which are being passed to Solr.
>>>
>>> Now, can I stop MCF from reading a file and sending its content, and just
>>> pass the above information to Solr?
>>>
>>> Thanks,
>>> Ameya
>>>
>>> On Thu, Jul 31, 2014 at 2:57 PM, Karl Wright <daddy...@gmail.com> wrote:
>>>
>>>> Hi Ameya,
>>>>
>>>> The file system connector does not retrieve any metadata for a document
>>>> at all. So I'm not sure what metadata you are talking about.
>>>>
>>>> Karl
>>>>
>>>> On Thu, Jul 31, 2014 at 2:44 PM, Ameya Aware <ameya.aw...@gmail.com>
>>>> wrote:
>>>>
>>>>> So the thing here is that I am not looking for any data or content of
>>>>> any of the files. I am just interested in the metadata of each file.
>>>>>
>>>>> So I thought it should be possible to not read any file and just get
>>>>> the metadata of the file and give it to Solr.
>>>>>
>>>>> This should save lots of time.
>>>>>
>>>>> Is it possible to do this?
>>>>>
>>>>> Thanks,
>>>>> Ameya
>>>>>
>>>>> On Thu, Jul 31, 2014 at 2:13 PM, Karl Wright <daddy...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Ameya,
>>>>>>
>>>>>> (1) Please look at the Simple History report. Note what kinds of
>>>>>> documents are being fetched, what kinds are being indexed, and how
>>>>>> long it is taking. I have noted from your previous posts that you seem
>>>>>> to be indexing a lot of very large EXE files. This is useless and you
>>>>>> should be excluding them.
>>>>>>
>>>>>> (2) Please look in the manifoldcf.log file for evidence that fetches
>>>>>> and/or Solr indexing requests are being retried due to errors. It
>>>>>> doesn't take many documents being chronically retried before forward
>>>>>> progress drops to near zero.
>>>>>>
>>>>>> (3) If you look into (1) & (2) and everything seems fine, it may be a
>>>>>> misalignment between availability of several kinds of resources that
>>>>>> is the problem. Please get a thread dump of the agents process while
>>>>>> it is crawling, using jstack. Post that thread dump and we can tell
>>>>>> you what to look at next.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>> On Thu, Jul 31, 2014 at 2:07 PM, Ameya Aware <ameya.aw...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am using the filesystem connector to index my entire C drive, with
>>>>>>> Solr as the output connector.
>>>>>>>
>>>>>>> The initial 100000 documents were crawled and indexed successfully in
>>>>>>> a couple of hours, but after that indexing slowed down badly (around
>>>>>>> 15-20 documents per minute).
>>>>>>>
>>>>>>> I am not able to figure out whether the issue is with MCF or Solr.
>>>>>>>
>>>>>>> Can you advise me on how to proceed with this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ameya
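
For reference, the metadata-only idea discussed in this thread can be sketched in plain Java without any ManifoldCF classes. This is only a rough illustration under stated assumptions: the class name MetadataOnlySketch and the helpers readMetadata and emptyContent are made up for this example, and a plain Map stands in for the RepositoryDocument fields. It reads the same attributes as the addField calls quoted above without opening the file's contents, and builds the empty input stream that Karl suggests substituting.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of ManifoldCF.
class MetadataOnlySketch {

    // Collects file-system metadata without reading the file's contents.
    // The Map is a stand-in for the fields added via data.addField(...).
    static Map<String, Object> readMetadata(Path path) throws IOException {
        BasicFileAttributes attr = Files.readAttributes(path, BasicFileAttributes.class);
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("created", new Date(attr.creationTime().toMillis()));
        fields.put("last_accessed", new Date(attr.lastAccessTime().toMillis()));
        fields.put("last_modified", new Date(attr.lastModifiedTime().toMillis()));
        fields.put("size", attr.size());
        return fields;
    }

    // A zero-length stream to pass along in place of the real file contents.
    static InputStream emptyContent() {
        return new ByteArrayInputStream(new byte[0]);
    }

    public static void main(String[] args) throws IOException {
        // Use a temporary file so the sketch runs anywhere.
        Path path = Files.createTempFile("metadata-only", ".txt");
        try {
            Files.write(path, "hello".getBytes(StandardCharsets.UTF_8));
            for (Map.Entry<String, Object> e : readMetadata(path).entrySet()) {
                System.out.println(e.getKey() + " = " + e.getValue());
            }
            try (InputStream is = emptyContent()) {
                // read() returns -1 immediately because the stream is empty.
                System.out.println("first byte of content = " + is.read());
            }
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

In the connector snippet at the top of this message, the analogous change would presumably be to keep the setBinary call but pass an empty stream and a length of 0 (for example data.setBinary(emptyContent(), 0L)) rather than commenting the line out. That is an assumption based on Karl's suggestion and has not been checked against the connector source.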