A few remarks:

-- I think the (CSV/Java) Stream API is not directly suitable: it is
difficult to implement sorting using streams alone. If you can suggest a
simpler solution, I will really appreciate it, because the simpler the code,
the better. Internally, of course, the library does use streams (channels)
and streams (data flows).
-- However it may seem, the problem is not trivial, although it sounds
simple. We have to care about memory, disk space and performance. Looking
through the code and commits will give an idea of the scope of the solution.
-- The library is designed to work with any file whose content can be split
by some separator (this is the main purpose of the library; there are also
some other utilities).
-- It is not very clear to me why the ability to sort lists of files is
suitable for commons-io, but content-sorting functionality is not.
-- It would be just great if you could hint where else this functionality
could be included. I am not inclined to insist on commons-io, but I'm sure
having such functionality somewhere in a well-known place will save other
developers time. It's strange that there is no such functionality anywhere
yet (or maybe I couldn't find it). Of course, this is used in databases and
other serious frameworks, but sometimes we don't want to mess with heavy
dependencies for the sake of a few lines of code.
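
To make the example from the quoted message below concrete, here is a minimal in-memory Java sketch of "sort by prefix" and "binary search by prefix" on delimiter-separated records. It is not the code of textfile-utils, which is written in Kotlin and works on files rather than strings. The class name `PrefixSortSketch`, the method names, and the rule that a record's prefix is the text before the first `,` or `:` are assumptions made only for illustration, and the input treats `a:21` and `c:"42"` as single records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

public class PrefixSortSketch {

    // Assumed key rule (hypothetical): the prefix is everything before the
    // first ',' or ':' in the record, or the whole record if neither occurs.
    static String prefixOf(String record) {
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (c == ',' || c == ':') {
                return record.substring(0, i);
            }
        }
        return record;
    }

    // Splits the content on the delimiter and sorts the records by prefix.
    // List.sort is stable, so records with equal prefixes keep their order.
    static List<String> sortByPrefix(String content, char delimiter) {
        String regex = Pattern.quote(String.valueOf(delimiter));
        List<String> records = new ArrayList<>(Arrays.asList(content.split(regex)));
        records.sort(Comparator.comparing(PrefixSortSketch::prefixOf));
        return records;
    }

    // Binary search for the first record with the given prefix (lower bound),
    // then a short linear scan to collect the neighbours sharing that prefix.
    static List<String> searchByPrefix(List<String> sorted, String prefix) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (prefixOf(sorted.get(mid)).compareTo(prefix) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        int end = lo;
        while (end < sorted.size() && prefixOf(sorted.get(end)).equals(prefix)) {
            end++;
        }
        return new ArrayList<>(sorted.subList(lo, end));
    }

    public static void main(String[] args) {
        List<String> sorted = sortByPrefix("d,420;b,42;b,21;a:21;c:\"42\"", ';');
        System.out.println(String.join(";", sorted));
        // prints: a:21;b,42;b,21;c:"42";d,420
        System.out.println(String.join(";", searchByPrefix(sorted, "b")));
        // prints: b,42;b,21
    }
}
```

The sketch holds everything in memory, which is exactly what does not scale to large files; a file-based version would have to sort in chunks and merge them, and search by seeking within the file.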



On Sun, Jul 9, 2023 at 11:07 PM Gary Gregory <garydgreg...@gmail.com> wrote:

> Commons CSV supports the Java Streaming API so you can do whatever that API
> offers,  including filtering, sorting, finding, and so on.
>
> More than plain CSVs are supported, and I encourage you to peruse the site
> https://commons.apache.org/proper/commons-csv/
>
> If you think that component can be enhanced, feel free to keep the
> conversation going with a more specific proposal.
>
> WRT Commons IO, it seems to me that IO is a lower level component and does
> not match your offering and that Commons CSV might lean too much toward CSV
> files. OTOH, it does not seem like what you propose would be generic enough
> to parse any binary file, say an old school dBASE file, but I could be
> wrong.
>
> Gary
>
> On Sun, Jul 9, 2023, 13:35 ssz <sss.z...@gmail.com> wrote:
>
> > Does commons-csv support **sorting** large files?
> > Does it support binary search?
> > What should I do if I have a non-csv text file?
> >
> > Actually I didn't say that textfile-utils is a library for working with
> > csv files.
> > I just provided you with an example.
> >
> >
> >
> >
> > On Sun, Jul 9, 2023 at 8:23 PM Gary Gregory <garydgreg...@gmail.com>
> > wrote:
> >
> > > If the intent is to process CSV files, you're missing quite a few
> > > parameters in order to process all of the different CSV flavors, see
> > > Apache Commons CSV.
> > >
> > > Gary
> > >
> > >
> > > On Sun, Jul 9, 2023, 13:16 ssz <sss.z...@gmail.com> wrote:
> > >
> > > > > Sorting text files, e.g. CSV.
> > > >
> > > > Example:
> > > > > content: `d,420;b,42;b,21;a:21;c:"42"`, delimiter ';'
> > > > after sort by prefix: `a:21;b,42;b,21;c:"42";d,420`
> > > > binary search by prefix `b`: `b,42;b,21`
> > > >
> > > > The project is completed with tests and documentation.
> > > > It is open source.
> > > > Github: https://github.com/DataFabricRus/textfile-utils
> > > >
> > > > I think there shouldn't be any problems with reading the code.
> > > > > Kotlin is an advanced Java, or you can consider it as pseudocode.
> > > >
> > > > Perhaps I should supplement the description in `README.md` to make it
> > > > clearer?
> > > > Could you please tell me what I should include?
> > > >
> > > > Yes, many databases have sorted files under the hood.
> > > > But what should I do if I need to just search in a big file?
> > > > > I can't reuse database code, I can't make a particular trivial task
> > > > > more complicated by using a database. I haven't been able to find any
> > > > > good solutions in regular libraries.
> > > > > So I reinvented this wheel. I think the desire to have such a library
> > > > > is understandable.
> > > >
> > > > Please ask any questions.
> > > >
> > > >
> > > > On Sun, Jul 9, 2023 at 6:40 PM Gary Gregory <garydgreg...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > > This seems to me like a mismatch with Commons IO.
> > > > > >
> > > > > > What does it even mean to "sort" a file, which is really just a
> > > > > > bunch of bytes?
> > > > > > Do you have a relevant example (Java-based)?
> > > > >
> > > > > This feels more like a database primitive to me. What am I missing?
> > > > >
> > > > > Gary
> > > > >
> > > > > On Sun, Jul 9, 2023, 10:42 ssz <sss.z...@gmail.com> wrote:
> > > > >
> > > > > > > It seems to be well-known and generic functionality, so it would
> > > > > > > be nice to have it in some well-known common place.
> > > > > > > Is *apache/commons-io* this place?
> > > > > >
> > > > > > > Here is the draft: https://github.com/DataFabricRus/textfile-utils
> > > > > > > This is my library made for DataFabric; it is written in Kotlin
> > > > > > > with coroutines and Java NIO.
> > > > > > > Of course, it can be ported to Java (preserving the Kotlin version
> > > > > > > for multiplatform).
> > > > > >
> > > > >
> > > >
> > >
> >
>
