Re: Limit on input PDF file size in Tika?

2017-06-08 Thread tesm...@gmail.com
) { System.out.println("Exception caught:"); }
//Convert the body handler to string and return the string to the calling function
return handler.toString(); }

Regards,

On Thu, Jun 8, 2017 at 4:29 PM, Nick Burch <apa...@gagravarr.org> wrote:
> On Thu, 8 Jun 2017, tesm...@gmail.com

Grobid with TXT and HTML files

2017-06-08 Thread tesm...@gmail.com
https://github.com/USCDataScience/parser-indexer-py/tree/master/parser-server
> [4] https://github.com/USCDataScience/parser-indexer-py/blob/master/docs/parser-index-journals.md
>
> *--*
> *Thamme Gowda*
> TG | @thammegowda <https://twitter.com/thammegowda>
>

Reading PDF/text/word file efficiently with Spark

2017-05-19 Thread tesm...@gmail.com
Hi, I am doing NLP (Natural Language Processing) on my data. The data is in the form of files that can be of type PDF/Text/Word/HTML. These files are stored in a directory structure on my local disk, including nested directories. My standalone Java-based NLP parser can read input files, extract
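The first step described in this post, finding the PDF/Text/Word/HTML files under a nested directory tree, could be sketched in plain Java roughly as follows. This is a minimal sketch and not the poster's parser: the class name `InputFileScanner`, its method names, and the list of file extensions are assumptions made for illustration, and the sketch does not involve Spark itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InputFileScanner {
    // Assumed extensions: the post names only the types PDF/Text/Word/HTML.
    static final Set<String> EXTENSIONS = Set.of("pdf", "txt", "doc", "docx", "html", "htm");

    // True when the file name ends in one of the assumed extensions (case-insensitive).
    static boolean isSupported(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        return dot >= 0 && EXTENSIONS.contains(name.substring(dot + 1));
    }

    // Walks the whole tree under root, including nested directories,
    // and returns the supported regular files in sorted order.
    static List<Path> listInputFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(InputFileScanner::isSupported)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: pass the data directory as the first argument.
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        for (Path file : listInputFiles(root)) {
            System.out.println(file);
        }
    }
}
```

The resulting list of paths is the kind of input that could then be handed to a parser or distributed across workers; how that is best done in Spark is the open question of the thread.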

Re: Analysing a document sections with Apache Tika

2017-05-04 Thread tesm...@gmail.com
> Thamme
>
> [1] http://grobid.readthedocs.io/en/latest/Introduction/
> [2] https://wiki.apache.org/tika/GrobidJournalParser
> [3] https://github.com/USCDataScience/parser-indexer-py/tree/master/parser-server
> [4] https://github.com/USCDataScience/parser-indexer-py

unsubscribe

2017-04-12 Thread tesm...@gmail.com
unsubscribe

Exception while creating an HttpSolrClient

2016-12-15 Thread tesm...@gmail.com
Hi, I am getting the following exception while creating a Solr client. Any help is appreciated.
=== This is the code snippet to create a SolrClient ===
public void populate (String args) throws IOException, SolrServerException {
String urlString = "http://localhost:8983/solr";

Solr+Solarium deployment on Azure - Best practices

2016-11-29 Thread tesm...@gmail.com
Hi, I am deploying a search engine on Azure. The following is my configuration: the Solr server is running on an Ubuntu VM (hosted on Azure), and the PHP web app is hosted on Azure using the same VM that hosts the Solr server. Are there any best practices or guidelines for this approach? I am getting the following exception:

HTTP Request timeout exception with Solr+Solarium on Azure

2016-11-29 Thread tesm...@gmail.com
Hi, I am deploying Solr+PHPSolarium on Azure. The Solr server is running in an Ubuntu VM on Azure. The PHP pages (PHPSolarium) are hosted as a web app on the same VM as the Solr server. After deployment, I am getting the following HTTP request timeout error: Fatal error: Uncaught exception

Re: Custom .... - Web toolkit for developing Solr Client application

2016-11-07 Thread tesm...@gmail.com
6, at 14:01, "tesm...@gmail.com" <tesm...@gmail.com> wrote:
> >
> > Hi,
> >
> > My search query comprises more than one field (a search string, a date
> > field, and one optional field).
> >
> > I need to represent these on the we

Re: Custom user web interface for Solr

2016-11-07 Thread tesm...@gmail.com
wrote:
> What kind of graphical format?
>
> > On Nov 4, 2016, at 14:01, "tesm...@gmail.com" <tesm...@gmail.com> wrote:
> >
> > Hi,
> >
> > My search query comprises more than one field (a search string, a date
> > field and

Custom user web interface for Solr

2016-11-04 Thread tesm...@gmail.com
Hi, My search query comprises more than one field (a search string, a date field, and one optional field). I need to present these to the users on the web interface. Secondly, I need to present the search data in a graphical format. Is there a Solr web client that provides the above

Re: Combine Data from PDF + XML

2016-10-26 Thread tesm...@gmail.com
to define the problem
>
> what do you mean by "combine"? Do the XML files
> contain, say, metadata about an associated PDF file?
>
> Or are these entirely orthogonal documents that
> you need to index into the same collection?
>
> Best,
> Erick
>
> On T

Combine Data from PDF + XML

2016-10-25 Thread tesm...@gmail.com
Hi, I am new to Apache Solr and am developing a search project. The source data comes from two sources: 1) XML files 2) PDF files. I need to combine these two sources for search. I couldn't find an example of combining these two sources. Any help is appreciated. Regards,

Jar for Spark development

2016-06-21 Thread tesm...@gmail.com
Hi, I am a beginner in Spark development. It took time to configure Eclipse + Scala. Is there any tutorial that can help beginners? I am still struggling to find the Spark JAR files for development. There is no lib folder in my Spark distribution (neither in the pre-built nor in the custom-built one). Regards,

Re: video stream as input to sequence files

2015-03-10 Thread tesm...@gmail.com
, tesm...@gmail.com tesm...@gmail.com wrote: Dear Daemeon, Thanks for your reply. Here is my flow. I am processing video frames using MapReduce. Presently, I convert the video files to individual frames, make a sequence file out of them and transfer the sequence file to HDFS. This flow

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

2015-03-07 Thread tesm...@gmail.com
Dear Jonathan, Would you please describe the process of running EMR-based Hadoop for $15.00? I tried, and my costs rocketed to about $60 for one hour. Regards On 05/03/2015 23:57, Jonathan Aquilina wrote: krish EMR won't cost you much with all the testing and data we ran through the test

Re: How to resolve--- Unauthorized request to start container. This token is expired.

2015-02-27 Thread tesm...@gmail.com
ntp: http://www.cyberciti.biz/faq/debian-ubuntu-linux-install-ntpd/ . When you have finished these steps you can check the systems’ clocks using the ‘date’ command. The differences between the servers should be minimal. Regards, Jan On 26 Feb 2015, at 19:19, tesm...@gmail.com wrote

Re: How to resolve--- Unauthorized request to start container. This token is expired.

2015-02-26 Thread tesm...@gmail.com
: Could you check for any time differences between your servers? If so, please install and run NTP, and retry your job. Regards, Jan On 26 Feb 2015, at 17:57, tesm...@gmail.com wrote: I am getting "Unauthorized request to start container. This token is expired." How can I resolve it? The problem

Re: How to resolve--- Unauthorized request to start container. This token is expired.

2015-02-26 Thread tesm...@gmail.com
should be minimal. Regards, Jan On 26 Feb 2015, at 19:19, tesm...@gmail.com wrote: Thanks Jan. I did the following: 1) Manually set the timezone of all the nodes using sudo dpkg-reconfigure tzdata 2) Rebooted the nodes. I am still getting the same exception. How can I configure NTP

Re: java.net.UnknownHostException on one node only

2015-02-25 Thread tesm...@gmail.com
Thanks Varun, Where shall I check to resolve it? Regards, Tariq On Mon, Feb 23, 2015 at 4:07 AM, Varun Kumar varun@gmail.com wrote: Hi Tariq, The issue looks like a DNS configuration issue. On Sun, Feb 22, 2015 at 3:51 PM, tesm...@gmail.com tesm...@gmail.com wrote: I am getting

HDFS data after nodes become unavailable?

2015-02-25 Thread tesm...@gmail.com
Dear all, I have transferred the data from local storage to HDFS in my 10-node Hadoop cluster. The replication factor is 3. Some nodes, say 3, are not available after some time. I can't use those nodes for computation or storage of data. What will happen to the data stored on HDFS of those

video stream as input to sequence files

2015-02-25 Thread tesm...@gmail.com
Hi, How can I use my video data files as input to a sequence file, or to HDFS directly? Regards, Tariq

Re: video stream as input to sequence files

2015-02-25 Thread tesm...@gmail.com
, thoroughly used up, totally worn out, and loudly proclaiming “Wow! What a Ride!” - Hunter Thompson
Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872*
On Wed, Feb 25, 2015 at 4:01 PM, tesm...@gmail.com tesm...@gmail.com wrote: Hi, How can I make my video data files as input

java.net.UnknownHostException on one node only

2015-02-22 Thread tesm...@gmail.com
I am getting java.net.UnknownHostException continuously on one node during Hadoop MapReduce execution. That node is accessible via SSH. This node is shown in the yarn node -list and hdfs dfsadmin -report queries. Below is the log from execution 15/02/22 20:17:42 INFO mapreduce.Job: Task Id :

Running MapReduce jobs in batch mode on different data sets

2015-02-21 Thread tesm...@gmail.com
Hi, Is it possible to run jobs on Hadoop in batch mode? I have 5 different datasets in HDFS and need to run the same MapReduce application on these datasets one after the other. Right now I am doing it manually. How can I automate this? How can I save the log of each execution in text
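One way to approach the question in this post is a small driver that launches the same job once per dataset, in sequence, and redirects each run's console output to its own text file. The following is a minimal sketch under stated assumptions, not a reply from the thread: the class `BatchJobRunner`, the jar name `my-app.jar`, and the `/data/setN` and `/output/runN` paths are hypothetical, and it assumes the `hadoop` command is on the PATH and that the jar's manifest names its main class. Without the `--run` argument it only prints what it would execute.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

public class BatchJobRunner {
    // Builds the "hadoop jar <jar> <input> <output>" command line for one dataset.
    static List<String> buildCommand(String jar, String input, String output) {
        return List.of("hadoop", "jar", jar, input, output);
    }

    // One log file per run, e.g. run-1.log, run-2.log, ...
    static String logFileName(int runNumber) {
        return "run-" + runNumber + ".log";
    }

    // Runs the command, writes stdout and stderr to logFile, and returns the exit code.
    static int runAndLog(List<String> command, File logFile)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        builder.redirectOutput(logFile);   // send the merged output to the log file
        return builder.start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        boolean execute = args.length > 0 && args[0].equals("--run");
        // Hypothetical HDFS input paths standing in for the five datasets.
        List<String> datasets = List.of(
                "/data/set1", "/data/set2", "/data/set3", "/data/set4", "/data/set5");

        for (int i = 0; i < datasets.size(); i++) {
            int runNumber = i + 1;
            List<String> command =
                    buildCommand("my-app.jar", datasets.get(i), "/output/run" + runNumber);
            if (!execute) {
                // Dry run: show the command and the log file it would write.
                System.out.println(String.join(" ", command) + " > " + logFileName(runNumber));
                continue;
            }
            int exitCode = runAndLog(command, new File(logFileName(runNumber)));
            System.out.println("run " + runNumber + " finished with exit code " + exitCode);
            if (exitCode != 0) {
                break; // stop the batch at the first failed job
            }
        }
    }
}
```

Stopping at the first non-zero exit code is a design choice of this sketch; a batch that should continue past failures could log the exit code and move on instead.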

Scheduling in YARN according to available resources

2015-02-20 Thread tesm...@gmail.com
I have 7 nodes in my Hadoop cluster [8 GB RAM and 4 vCPUs per node], 1 NameNode + 6 DataNodes. I followed the link from Hortonworks [ http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html ] and made calculations according to the hardware

YARN container launch failed exception and mapred-site.xml configuration

2015-02-20 Thread tesm...@gmail.com
I have 7 nodes in my Hadoop cluster [8 GB RAM and 4 vCPUs per node], 1 NameNode + 6 DataNodes. **EDIT-1@ARNON:** I followed the link, made calculations according to the hardware configuration of my nodes, and have added the updated mapred-site.xml and yarn-site.xml files to my question. Still my

Re: Scheduling in YARN according to available resources

2015-02-20 Thread tesm...@gmail.com
ravishankar.n...@gmail.com wrote: I had a very similar issue; I switched to Oracle JDK. I see nothing wrong with your configuration at first look. Thanks. Regards, Nair On Sat, Feb 21, 2015 at 1:42 AM, tesm...@gmail.com tesm...@gmail.com wrote: I have 7 nodes in my Hadoop cluster

Fwd: YARN container launch failed exception and mapred-site.xml configuration

2015-02-20 Thread tesm...@gmail.com
I have 7 nodes in my Hadoop cluster [8 GB RAM and 4 vCPUs per node], 1 NameNode + 6 DataNodes. I followed the link from Hortonworks [ http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html ] and made calculations according to the hardware

Re: Reconfiguration Problem

2014-08-06 Thread tesm...@gmail.com
, Richard Heck rgh...@lyx.org wrote: On 08/05/2014 11:45 AM, tesm...@gmail.com wrote: Dear Richard, I get the following error message during LyX reconfiguration: The script '/usr/share/lyx/scriptes/TexFile.py' failed. Secondly, all the document classes are unavailable in my LyX installation. I

Re: Reconfiguration Problem

2014-08-06 Thread tesm...@gmail.com
PM, Richard Heck rgh...@lyx.org wrote: On 08/06/2014 04:52 AM, tesm...@gmail.com wrote: Dear Richard, Attached is the console log for LyX. Starting LyX from the console solved the reconfiguration errors. Reconfiguration still fails when LyX is started directly from the GUI. That probably means

Re: Reconfiguration Problem

2014-08-06 Thread tesm...@gmail.com
, Richard Heck <rgh...@lyx.org> wrote:
> On 08/05/2014 11:45 AM, tesm...@gmail.com wrote:
>
>> Dear Richard,
>>
>> I get the following error message while Lyx Reconfiguration:
>>
>> The script '/usr/share/lyx/scriptes/TexFile.py' failed
>>
>> Se

Re: Reconfiguration Problem

2014-08-06 Thread tesm...@gmail.com
PM, Richard Heck <rgh...@lyx.org> wrote:
> On 08/06/2014 04:52 AM, tesm...@gmail.com wrote:
>
>> Dear Richard,
>>
>> Attached is console log for lyx.
>>
>> Starting Lyx from console solved the re-configuration errors.
>> Reconfiguration s

Export to OpenOffice in Lyx2.1.1

2014-08-04 Thread tesm...@gmail.com
Hi, I exported a LyX document to OpenOffice format about a year ago. I recently updated to LyX 2.1.1. When I try to export my .lyx file to OpenOffice format, I get the message "NO document support is available for this format". Is this support discontinued in LyX 2.1.1? Regards,

Export to OpenOffice in Lyx2.1.1

2014-08-04 Thread tesm...@gmail.com
Hi, I exported a LyX document to OpenOffice format about a year ago. I recently updated to LyX 2.1.1. When I try to export my .lyx file to OpenOffice format, I get the message "NO document support is available for this format". Is this support discontinued in LyX 2.1.1? Regards,