I think I know why it works faster: VectorSource in the code above takes
only the file names as the corpus, not the contents of the files :D duh!

Any suggestions for creating a vector source from the contents of the txt files?
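
One possible approach, as a rough sketch rather than a tested fix for this corpus: read each file into a single string with base R, then pass the resulting character vector to VectorSource. The helper name read_txt_contents() and the temporary demo directory are made up for illustration and are not part of tm; the tm call is guarded so the snippet still runs when tm is not installed.

```r
# Sketch only: read_txt_contents() is a hypothetical helper, not part of tm.
read_txt_contents <- function(dir, pattern = "\\.txt$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  # One string per file: read its lines and join them with newlines.
  texts <- vapply(
    files,
    function(f) paste(readLines(f, warn = FALSE), collapse = "\n"),
    character(1)
  )
  # Name each element after its file so documents stay identifiable.
  names(texts) <- basename(files)
  texts
}

# Self-contained demo on a temporary directory with two small files
# (stand-ins for the real sourceDir, which is not available here).
demo_dir <- file.path(tempdir(), "vs_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("arma virumque cano", "Troiae qui primus ab oris"),
           file.path(demo_dir, "a.txt"))
writeLines("in nova fert animus", file.path(demo_dir, "b.txt"))

texts <- read_txt_contents(demo_dir)

# Build the corpus from file contents instead of file names.
if (requireNamespace("tm", quietly = TRUE)) {
  ovid <- tm::Corpus(tm::VectorSource(texts),
                     readerControl = list(language = "lat"))
}
```

With the real data, demo_dir would be replaced by sourceDir. Since some of the files are non-English, readLines() may also need an explicit encoding argument; which encoding applies depends on how the files were converted, so that part is left as an assumption.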


On Sat, Aug 17, 2013 at 1:59 PM, Ajinkya Kale <kaleajin...@gmail.com> wrote:

> Funny, it works fine if I use VectorSource:
> ovid <- Corpus(VectorSource(list.files(sourceDir)[1:1253]),
>                readerControl = list(language = "lat"))
> So I tried executing only DirSource(sourceDir), and that fails with the
> error I mentioned earlier. So it's not a problem with Corpus(), as I
> initially thought.
>
> Also, I noticed that VectorSource works much faster than DirSource
> there.
> Any particular reason?
>
>
> On Sat, Aug 17, 2013 at 11:16 AM, Ajinkya Kale <kaleajin...@gmail.com> wrote:
>
>> It contains only text files, which were converted from doc, docx, ppt etc.
>> using LibreOffice.
>> Some of them are non-English text documents.
>>
>> Sorry, I cannot share the corpus, but if someone can shed light on what
>> might cause this error, I can try to eliminate those documents if some
>> specific docs are causing it.
>>
>>
>> On Sat, Aug 17, 2013 at 9:55 AM, Milan Bouchet-Valat <nalimi...@club.fr> wrote:
>>
>>> On Friday, 16 August 2013 at 19:35 -0700, Ajinkya Kale wrote:
>>> > I am trying to use the text mining package ... I keep getting this
>>> > error:
>>> >
>>> > rm(list=ls())
>>> > library(tm)
>>> > sourceDir <- "Z:\\projectk_viz\\docs_to_index"
>>> > ovid <- Corpus(DirSource(sourceDir),
>>> >                readerControl = list(language = "lat"))
>>> >
>>> > Error in if (vectorized && (length <= 0)) stop("vectorized sources must
>>> > have positive length") : missing value where TRUE/FALSE needed
>>> >
>>> > I am not sure what it means.
>>> The posting guide asks for a reproducible example. If you cannot make
>>> the contents of sourceDir available to us, you should at least tell us
>>> what kind of files it contains. Have you tried with only some of the
>>> files in the directory?
>>>
>>>
>>> Regards
>>>
>>> > --ajinkya
>>> >
>>> >       [[alternative HTML version deleted]]
>>> >
>>> > ______________________________________________
>>> > R-help@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>> --
>>
>> Sincerely,
>> Ajinkya
>> http://ajinkya.info
>>
>
>
>
> --
>
> Sincerely,
> Ajinkya
> http://ajinkya.info
>



-- 

Sincerely,
Ajinkya
http://ajinkya.info