Hi Rakesh,

Got it. Can you please submit a PR for your simple fix? I’ll happily credit you, and it would be great for the Tika community.

Thanks!
-C

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

On 6/10/16, 9:11 AM, "Rakesh Kumar" <iitd.tani...@gmail.com> wrote:

>No, I corrected the error, it was a small mistake on your side. Problem was
>trailing forward slash after url . Please see the explanation below
>
> http://www.assignmenthelp.net/contactus
> will be saved as /Temp/contactus
> http://www.assignmenthelp.net/abc.pdf
> will be saved as /Temp/abc.pdf
>
>What about, http://www.assignmenthelp.net/
> => you are trying to save it as /tmp/<Null> hence the error
>However if there is no "/" at the end of url i.e.
>http://www.assignmenthelp.net then
> you try to save it as \Temp/www.assignmenthelp.net
>
>Hence small correction was to remove "/" if it is there at the end of url .
>
>Rest everything is ok.
>
>On Fri, Jun 10, 2016 at 8:38 PM, Mattmann, Chris A (3980)
><chris.a.mattm...@jpl.nasa.gov> wrote:
>
>[moved to dev@tika.a.o list please follow replies there.]
>
>Rakesh - looks like you don’t have permissions to write to
>your temp dir on Windows. Can you confirm that’s the case?
>
>On 6/10/16, 2:01 AM, "Rakesh Kumar" <iitd.tani...@gmail.com> wrote:
>
>>Hi when I try to extract content from url it prints error
>>
>>tika.py: Retrieving http://www.assignmenthelp.net/ to C:\Users\Rakesh\AppData\Local\Temp/
>>Traceback (most recent call last):
>>  File "C:\Users\Rakesh\Anaconda3\lib\site-packages\tika\tika.py", line 368, in getRemoteFile
>>    urlretrieve(urlOrPath, destPath)
>>  File "C:\Users\Rakesh\Anaconda3\lib\urllib\request.py", line 197, in urlretrieve
>>    tfp = open(filename, 'wb')
>>PermissionError: [Errno 13] Permission denied: 'C:\Users\Rakesh\AppData\Local\Temp/'
>>During handling of the above exception, another exception occurred:
>>Traceback (most recent call last):
>>  File "prg.py", line 7, in
>>    parsed = parser.from_file(fileUrl, tikServer)
>>  File "C:\Users\Rakesh\Anaconda3\lib\site-packages\tika\parser.py", line 25, in from_file
>>    jsonOutput = parse1('all', filename, serverEndpoint)
>>  File "C:\Users\Rakesh\Anaconda3\lib\site-packages\tika\tika.py", line 184, in parse1
>>    path, file_type = getRemoteFile(urlOrPath, TikaFilesPath)
>>  File "C:\Users\Rakesh\Anaconda3\lib\site-packages\tika\tika.py", line 378, in getRemoteFile
>>    urlretrieve(urlOrPath, destPath)
>>  File "C:\Users\Rakesh\Anaconda3\lib\urllib\request.py", line 197, in urlretrieve
>>    tfp = open(filename, 'wb')
>>PermissionError: [Errno 13] Permission denied: 'C:\Users\Rakesh\AppData\Local\Temp/'
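
For anyone following the thread, the fix Rakesh describes (remove a trailing "/" from the URL before taking its last segment as the local file name) might look roughly like the sketch below. The helper name `dest_path_for_url` and its parameters are hypothetical and are not taken from tika-python; this only illustrates the idea and is not the actual patch.

```python
import os


def dest_path_for_url(url, dest_dir):
    """Hypothetical helper: choose a local file path for a downloaded URL."""
    # Remove any trailing "/" so the last URL segment is never empty.
    # Without this, "http://host/" yields an empty file name and the
    # download is attempted against the directory itself.
    trimmed = url.rstrip("/")
    # Take the text after the final "/"; for a bare host this is the host name.
    filename = trimmed.rsplit("/", 1)[-1]
    return os.path.join(dest_dir, filename)


if __name__ == "__main__":
    # Print the chosen path for each URL form mentioned in the thread.
    for url in (
        "http://www.assignmenthelp.net/contactus",
        "http://www.assignmenthelp.net/abc.pdf",
        "http://www.assignmenthelp.net/",
        "http://www.assignmenthelp.net",
    ):
        print(url, "->", dest_path_for_url(url, "Temp"))
```

With this approach the URL with and without the trailing "/" both map to a file named after the host, so the open() call gets a file path and not the temp directory itself.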