What was wrong in my example that caused this?

Public Sub Main()

  Dim sSort As String[] = ["A", "B", "B", "B", "C", "D", "D", "E", "E", "E", "E", "F"]
  Dim s As String

  For Each s In ReturnArrays(sSort, 0)
    Print s
  Next
  For Each s In ReturnArrays(sSort, -1)
    Print s
  Next

End

Private Function ReturnArrays(SortedArray As String[], withNumber As Boolean) As String[]

  Dim sSingle, sWithNumber As New String[]
  Dim i, n As Integer

  For i = 0 To SortedArray.Max ' You can avoid this with Tobias's trick (For i = 1 To ...)
    If i < SortedArray.Max Then
      If SortedArray[i] = SortedArray[i + 1] Then
        Inc n
      Else
        Inc n
        sSingle.Push(SortedArray[i])
        sWithNumber.Push(n & SortedArray[i])
        n = 0
      Endif
    Endif
  Next
  Inc n
  sSingle.Push(SortedArray[SortedArray.Max])
  sWithNumber.Push(n & SortedArray[SortedArray.Max])

  If withNumber Then
    Return sWithNumber
  Else
    Return sSingle
  Endif

End

Regards
Gianluigi

2017-06-30 15:05 GMT+02:00 Tobias Boege <tabo...@gmail.com>:

> On Fri, 30 Jun 2017, Fernando Cabral wrote:
> > 2017-06-30 7:44 GMT-03:00 Fabien Bodard <gambas...@gmail.com>:
> > >
> > > The best way is the nando one ... at least for Gambas.
> > > As you don't have to care about the index value or the order, the
> > > walk-ahead option is the better one.
> > >
> > > Then, Fernando ... for big, big things ... I think you need to use a DB,
> > > or a native language ... maybe an SQLite in-memory structure can be good.
> > >
> > Fabien, since this is a one-time-only thing, I don't think I'd be better
> > off with a database.
> > Basically, I read a text file and then break it down into words, sentences
> > and paragraphs.
> > Next I count the items in each array (words, sentences, paragraphs).
> > Array.Count works wonderfully.
> > After that, I have to eliminate the duplicate words (Array.Words). But in
> > doing so, I also have to count how many times each word appeared.
> >
> > Finally I sort the Array.Sentences and the Array.Paragraphs by size
> > (String.Len()). The Array.Words are sorted by count + length. This is all
> > working well.
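[Editor's aside, not part of the thread: the task Fernando describes here — de-duplicate a word list while counting occurrences, then sort by count and length — can be sketched outside Gambas in a few lines of Python. The word list below is made up for illustration.]

```python
from collections import Counter

# Hypothetical word list standing in for Fernando's Array.Words.
words = ["the", "cat", "The", "dog", "the", "Dog"]

# Count case-insensitively (like "uniq -ci") while de-duplicating.
counts = Counter(w.lower() for w in words)

# Sort the unique words by count, then length, both descending.
unique_words = sorted(counts, key=lambda w: (counts[w], len(w)), reverse=True)

print(unique_words)
print(counts["the"])
```

This folds the eliminate-duplicates and count steps into a single hash-based pass, which is one of the alternatives to the sort | uniq pipeline discussed below.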
> > So, my quest is for the fastest way to eliminate the duplicate words
> > while I count them. For the time being, here is a working solution based
> > on the system's sort | uniq.
> >
> > Here is one of the versions I have been using:
> >
> >   Exec ["/usr/bin/uniq", "Unsorted.txt", "Sorted.srt2"] Wait
> >   Exec ["/usr/bin/uniq", "-ci", "SortedWords.srt2", "SortedWords.srt3"] Wait
> >   Exec ["/usr/bin/sort", "-bnr", "SortedWords.srt3"] To UniqWords
>
> Are those temporary files? You can avoid them by piping your data into the
> processes and reading their output directly. Otherwise the Temp$() function
> gives you better temporary files.
>
> >   WordArray = Split(UniqWords, "\n")
> >
> > So, I end up with the result I want. It's effective. Now, it would be
> > more elegant if I could do the same with Gambas. Of course, the sorting
> > would be easy with the built-in WordArray.Sort(). But how about the
> > '"/usr/bin/uniq", "-ci" ...' part?
>
> I feel like my other mail answered this, but I can give you another version
> of that routine (which I said I would leave as an exercise to you):
>
>   ' Remove duplicates in an array like "uniq -ci". String comparison is
>   ' case-insensitive. The i-th entry in the returned array counts how many
>   ' times aStrings[i] (in the de-duplicated array) was present in the input.
>   ' The data in ~aStrings~ is overwritten. Assumes the array is sorted.
>   Private Function Uniq(aStrings As String[]) As Integer[]
>
>     Dim iSrc, iLast As Integer
>     Dim aCount As New Integer[](aStrings.Count)
>
>     If Not aStrings.Count Then Return []
>     iLast = 0
>     aCount[iLast] = 1
>     For iSrc = 1 To aStrings.Max
>       If String.Comp(aStrings[iSrc], aStrings[iLast], gb.IgnoreCase) Then
>         Inc iLast
>         aStrings[iLast] = aStrings[iSrc]
>         aCount[iLast] = 1
>       Else
>         Inc aCount[iLast]
>       Endif
>     Next
>
>     ' Now shrink the arrays to the memory they actually need.
>     aStrings.Resize(iLast + 1)
>     aCount.Resize(iLast + 1)
>     Return aCount
>
>   End
>
> What is, in my opinion, at least theoretically better here than in the
> other proposed solutions is that it runs in linear time, while nando's is
> quadratic [*]. (Of course, if you sort beforehand, it becomes n*log(n),
> which is still better than quadratic.)
>
> Attached is a test script with some words. It runs the sort + uniq
> utilities first, and then Array.Sort() + the Uniq() function above. The
> program then prints the diff between the two outputs. I get an empty diff,
> meaning that my Gambas routines produce exactly the same output as the
> shell utilities.
>
> Regards,
> Tobi
>
> [*] He calls the array functions Add() and Find() inside a For loop that
>     runs over an array of size n. Adding elements to an array and searching
>     an array themselves have worst-case linear complexity, giving quadratic
>     complexity overall. My implementation reserves enough space in advance
>     to avoid calling Add() in a loop, and since the array is sorted, we can
>     do without Find(), too. Actually, as you may know, adding an element to
>     the end of an array can be implemented in amortized constant time (as
>     C++'s std::vector does) by wasting some space, but AFAICS Gambas
>     doesn't do this. I could be wrong.
>
> --
> "There's an old saying: Don't change anything... ever!" -- Mr. Monk

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gambas-user mailing list
Gambas-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/gambas-user
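
[Editor's aside, not part of the thread: the single linear pass Tobias describes — walk a sorted array once, bumping a counter while neighbouring entries compare equal, compacting unique entries to the front — can be sketched in Python for readers following along outside Gambas. The function name and sample data are illustrative, not the thread's actual code.]

```python
def uniq_count(strings):
    """De-duplicate a sorted list in place, like `uniq -ci`.

    Returns counts: counts[i] says how many times the (case-insensitively
    compared) word strings[i] of the de-duplicated list appeared.
    """
    if not strings:
        return []
    counts = [1]
    last = 0  # index of the most recent unique word
    for i in range(1, len(strings)):
        if strings[i].lower() != strings[last].lower():
            last += 1
            strings[last] = strings[i]  # compact unique words to the front
            counts.append(1)
        else:
            counts[-1] += 1
    del strings[last + 1:]  # shrink to the memory actually needed
    return counts

words = ["a", "A", "b", "c", "C", "C"]  # must already be sorted
counts = uniq_count(words)
print(words)   # ['a', 'b', 'c']
print(counts)  # [2, 1, 3]
```

Like the Gambas Uniq() above, it exploits the sort order to avoid any Find()-style searching and never grows the word array, so the whole pass is O(n).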