On Thu, Oct 16, 2014 at 2:15 PM, Martin Gräßlin <mgraess...@kde.org> wrote:

> the txt being genome data doesn't surprise me[1], but I find it sad that now
> txt is disabled by default (I use them quite a lot for blog posts). As genome
> data is really huge wouldn't it make sense to go rather for file size or abort
> the indexing if it's obvious random gibberish?

We currently have a hard limit of 50 MB on 'text/plain' files. However, this
does not include log files, which have a separate mimetype. Perhaps it
would really be good to reduce it to about 5 MB.
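
As a rough illustration of the size check described above, here is a minimal
Python sketch. It is not the indexer's actual code: the names
`MAX_PLAIN_TEXT_BYTES` and `should_index` are hypothetical, and the limit is
the proposed value of about 5 MB, not a setting that exists today.

```python
# Hypothetical sketch of a per-mimetype size limit; names are illustrative only.
MAX_PLAIN_TEXT_BYTES = 5 * 1024 * 1024  # the proposed limit of about 5 MB


def should_index(mimetype, size_bytes, limit=MAX_PLAIN_TEXT_BYTES):
    """Return False for 'text/plain' files larger than the limit.

    Other mimetypes (log files included, since they have their own type)
    are not affected by this particular check.
    """
    if mimetype == "text/plain" and size_bytes > limit:
        return False
    return True
```

A small blog-post draft would pass this check, while a multi-gigabyte genome
dump saved as plain text would be skipped before any content is read.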

About gibberish: it's hard to figure out what gibberish is. I think I'll
add some code so that we only index the first 20 characters of each word. That
should help to a certain extent.
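
The truncation idea could look roughly like the following Python sketch. It
assumes words are separated by whitespace, which is a simplification; the
names `MAX_TERM_LENGTH` and `index_terms` are hypothetical and not taken from
the indexer.

```python
# Hypothetical sketch: keep only the first 20 characters of each word.
MAX_TERM_LENGTH = 20


def index_terms(text, max_len=MAX_TERM_LENGTH):
    """Split text on whitespace and truncate every word to max_len characters."""
    return [word[:max_len] for word in text.split()]
```

Ordinary words are almost always shorter than 20 characters and pass through
unchanged, while a long run of genome letters is cut down to a single short
term, which bounds the size of each term stored in the index.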

Vishesh Handa
Plasma-devel mailing list
