D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-11-13 Thread Igor Poboiko
poboiko edited the summary of this revision. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D23787 To: poboiko, #baloo, bruns, ngraham Cc: davidedmundson, broulik, kde-frameworks-devel, #baloo, hurikhan77, lots0logs, LeGast00n, fbampaloukas, GB_2, domson, ashaposhnikov,

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-11-13 Thread Igor Poboiko
poboiko added a comment. @bruns: I've missed D16593: [ExtractorCollection] Use only best matching extractor plugin, and had in mind the previous situation where we matched all extractors based on inheritance. In that case, "Secondly" part indeed does not

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-11-13 Thread Stefan Brüns
bruns added a comment. In D23787#541963, @poboiko wrote: > In D23787#537891, @bruns wrote: > > > Can you please provide an example which: > > > > - is currently indexed though it should be

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-11-13 Thread Stefan Brüns
bruns added a comment. In D23787#541963, @poboiko wrote: > > and another example which: > > > > - is currently skipped though it should be indexed > > - is indexed after this change > > There shouldn't be any. I mean,

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-11-13 Thread Igor Poboiko
poboiko added a comment. Ping? REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D23787 To: poboiko, #baloo, bruns, ngraham Cc: davidedmundson, broulik, kde-frameworks-devel, #baloo, hurikhan77, lots0logs, LeGast00n, fbampaloukas, GB_2, domson, ashaposhnikov, michaelh,

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-10-04 Thread Igor Poboiko
poboiko updated this revision to Diff 67318. poboiko added a comment. Minor optimization: change check order (filesize / extractor property). Also, it should be better to check by "Id" instead of "Name". REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-10-04 Thread Igor Poboiko
poboiko marked an inline comment as done. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D23787 To: poboiko, #baloo, bruns, ngraham Cc: davidedmundson, broulik, kde-frameworks-devel, #baloo, lots0logs, LeGast00n, fbampaloukas, GB_2, domson, ashaposhnikov, michaelh,

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-10-04 Thread David Edmundson
davidedmundson added inline comments. INLINE COMMENTS > app.cpp:173 > -if (fileInfo.size() >= 10 * 1024 * 1024) { > -tr->removePhaseOne(id); > -return; This original line seemed very very wrong. Just because we won't want to index phase 2 isn't a reason to

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-10-04 Thread Nathaniel Graham
ngraham added inline comments. INLINE COMMENTS > poboiko wrote in app.cpp:185 > I thought users might actually want to know if a file was excluded (and the > reasoning behind it). > I can lower its severity, i.e. `qCDebug`. Or do you think it should be > completely removed? Users don't read log

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-10-04 Thread Igor Poboiko
poboiko added inline comments. INLINE COMMENTS > bruns wrote in app.cpp:185 > Users will love us for spamming the logs ... I thought users might actually want to know if a file was excluded (and the reasoning behind it). I can lower its severity, i.e. `qCDebug`. Or do you think it should be

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-10-04 Thread Igor Poboiko
poboiko added a comment. In D23787#537891, @bruns wrote: > Can you please provide an example which: > > - is currently indexed though it should be skipped due to size > - is skipped after this change Sure. Any mimetype inherited

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-09-25 Thread Stefan Brüns
bruns requested changes to this revision. bruns added a comment. This revision now requires changes to proceed. Can you please provide an example which: - is currently indexed though it should be skipped due to size - is skipped after this change and another example which: -

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-09-25 Thread Nathaniel Graham
ngraham accepted this revision. This revision is now accepted and ready to land. REPOSITORY R293 Baloo BRANCH improve-large-text-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D23787 To: poboiko, #baloo, bruns, ngraham Cc: broulik, kde-frameworks-devel, #baloo,

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-09-25 Thread Igor Poboiko
poboiko marked an inline comment as done. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D23787 To: poboiko, #baloo, bruns, ngraham Cc: broulik, kde-frameworks-devel, #baloo, lots0logs, LeGast00n, fbampaloukas, GB_2, domson, ashaposhnikov, michaelh, astippich, spoorun,

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-09-25 Thread Igor Poboiko
poboiko updated this revision to Diff 66805. poboiko added a comment. Address raised issue: fetch file size only once REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D23787?vs=65646&id=66805 BRANCH improve-large-text-files (branched from master) REVISION

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-09-08 Thread Kai Uwe Broulik
broulik added inline comments. INLINE COMMENTS > app.cpp:183 > +// have trouble processing them > +if ((ex->extractorProperties()["Name"].toString() == > QLatin1String("PlaintextExtractor")) && (QFileInfo(url).size() >= 10 * 1024 * > 1024)) { > +qCWarning(BALOO) <<
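
The condition quoted above combines two tests: the extractor is the plain-text one, and the file is at least 10 MiB. A minimal sketch of that check in plain C++ follows, with no Qt dependency. The names `ExtractorProperties`, `kMaxPlainTextSize` and `shouldSkipLargePlainText` are hypothetical and not taken from the patch; a `std::map` stands in for the result of `ex->extractorProperties()`, and the file size is passed in by the caller so it is fetched only once. The sketch keys on "Name" as the quoted snippet does; a later update in this thread suggests checking "Id" instead.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the extractor's property map; the code under
// review queries ex->extractorProperties(), which returns a Qt type.
using ExtractorProperties = std::map<std::string, std::string>;

// 10 MiB, the threshold used in the snippet under review.
constexpr std::int64_t kMaxPlainTextSize = 10LL * 1024 * 1024;

// Returns true when the file should be skipped. Only files handled by the
// plain-text extractor are subject to the size limit.
bool shouldSkipLargePlainText(const ExtractorProperties& props,
                              std::int64_t fileSize)
{
    // Cheap integer comparison first, so small files never need a lookup.
    if (fileSize < kMaxPlainTextSize) {
        return false;
    }
    const auto it = props.find("Name");
    return it != props.end() && it->second == "PlaintextExtractor";
}
```

Doing the size comparison before the property lookup is one possible ordering; it keeps the common case (small files) to a single integer comparison.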

D23787: [baloo_file_extractor] Improve handling of large plain-text files

2019-09-08 Thread Igor Poboiko
poboiko created this revision. poboiko added reviewers: Baloo, bruns, ngraham. Herald added projects: Frameworks, Baloo. poboiko requested review of this revision. REVISION SUMMARY First of all, not all plain text-based mimetypes start with `text/`: e.g. `application/sql` for SQL dumps