Thanks Cheng !!
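
For anyone following along, here is a minimal sketch of the per-partition idea in Cheng's reply quoted below. It is not his exact code: plain Scala collections stand in for RDD partitions (an assumption, so that no Spark dependency is needed), a simple loop replaces the Iterator.continually trick, and the names LastElementSketch, lastOfPartition and lastOption are made up for illustration.

```scala
// Sketch only: plain Scala collections stand in for RDD partitions.
// All names here are hypothetical and not part of the Spark API.
object LastElementSketch {
  // Returns an iterator holding at most one element: the last element of `it`.
  // Only the most recently seen element is kept in memory.
  def lastOfPartition[T](it: Iterator[T]): Iterator[T] = {
    var last: Option[T] = None
    while (it.hasNext) last = Some(it.next())
    last.iterator
  }

  // `partitions` plays the role of an RDD's partitions, in order.
  // Each partition contributes at most one candidate; the final candidate wins.
  def lastOption[T](partitions: Seq[Seq[T]]): Option[T] =
    partitions.flatMap(p => lastOfPartition(p.iterator)).lastOption

  def main(args: Array[String]): Unit = {
    println(lastOption(Seq(Seq("a", "b"), Seq("c"), Seq.empty[String]))) // Some(c)
    println(lastOption(Seq.empty[Seq[String]]))                          // None
  }
}
```

With a real RDD, the body of lastOfPartition would presumably go inside mapPartitions, followed by collect().lastOption on the driver, as in the quoted code.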
On Thu, Apr 24, 2014 at 5:43 PM, Cheng Lian <lian.cs....@gmail.com> wrote:

> You may try this:
>
>     val lastOption = sc.textFile("input").mapPartitions { iterator =>
>       if (iterator.isEmpty) {
>         iterator
>       } else {
>         Iterator
>           .continually((iterator.next(), iterator.hasNext))
>           .collect { case (value, false) => value }
>           .take(1)
>       }
>     }.collect().lastOption
>
> Iterator-based data access ensures O(1) space complexity, and it runs
> faster because different partitions are processed in parallel. lastOption is
> used instead of last to deal with an empty file.
>
>
> On Thu, Apr 24, 2014 at 7:38 PM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
>
>> Hi All, Finally I wrote the following code, which I feel is close to
>> optimal, if not the most optimal.
>> It uses a file pointer and seeks backwards to the byte after the last \n.
>> This is memory efficient, and I expect the Unix tail implementation does
>> something similar.
>>
>> import java.io.RandomAccessFile
>>
>> // Assumes the file ends with a single trailing "\n".
>> val FILEPATH = "/home/sparkcluster/hadoop-2.3.0/temp"
>> val fileHandler = new RandomAccessFile(FILEPATH, "r")
>> val fileLength = fileHandler.length() - 1   // offset of the final byte
>> var cond = 1                                // set to 0 once a line break is found
>> var filePointer = fileLength - 1            // start just before the final byte
>> var toRead = -1                             // number of bytes in the last line
>> while (filePointer >= 0 && cond != 0) {
>>   fileHandler.seek(filePointer)
>>   val readByte = fileHandler.readByte()
>>   if (readByte == 0xA && filePointer != fileLength)
>>     cond = 0
>>   else if (readByte == 0xD && filePointer != fileLength - 1)
>>     cond = 0
>>   filePointer = filePointer - 1
>>   toRead = toRead + 1
>> }
>> if (cond == 0) {
>>   filePointer = filePointer + 2             // first byte after the line break
>> } else {
>>   filePointer = 0                           // no line break found: the file is one line
>>   toRead = toRead + 1
>> }
>> val bytes = new Array[Byte](toRead)
>> fileHandler.seek(filePointer)
>> fileHandler.read(bytes)
>> fileHandler.close()
>> val bdd = new String(bytes)                 /* bdd contains the last line */
>>
>>
>> On Thu, Apr 24, 2014 at 11:42 AM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
>>
>>> Thanks Guys!
>>>
>>>
>>> On Thu, Apr 24, 2014 at 11:29 AM, Sourav Chandra <sourav.chan...@livestream.com> wrote:
>>>
>>>> The same thing can also be done using rdd.top(1)(reverseOrdering)
>>>>
>>>>
>>>> On Thu, Apr 24, 2014 at 11:28 AM, Sourav Chandra <sourav.chan...@livestream.com> wrote:
>>>>
>>>>> You can use rdd.takeOrdered(1)(reverseOrdering)
>>>>>
>>>>> reverseOrdering is your Ordering[T] instance, where you define the
>>>>> ordering logic. You have to pass it to the method.
>>>>>
>>>>>
>>>>> On Thu, Apr 24, 2014 at 11:21 AM, Frank Austin Nothaft <fnoth...@berkeley.edu> wrote:
>>>>>
>>>>>> If you do this, you could simplify to:
>>>>>>
>>>>>> RDD.collect().last
>>>>>>
>>>>>> However, this has the problem of collecting all data to the driver.
>>>>>>
>>>>>> Is your data sorted? If so, you could reverse the sort and take the
>>>>>> first. Alternatively, a hacky implementation might involve a
>>>>>> mapPartitionsWithIndex that returns an empty iterator for all partitions
>>>>>> except for the last. For the last partition, you would filter out all
>>>>>> elements except for the last element in your iterator. This should
>>>>>> leave one element, which is your last element.
>>>>>>
>>>>>> Frank Austin Nothaft
>>>>>> fnoth...@berkeley.edu
>>>>>> fnoth...@eecs.berkeley.edu
>>>>>> 202-340-0466
>>>>>>
>>>>>> On Apr 23, 2014, at 10:44 PM, Adnan Yaqoob <nsyaq...@gmail.com> wrote:
>>>>>>
>>>>>> This function returns a Scala Array; you can use its last
>>>>>> method to get the last element.
>>>>>>
>>>>>> For example:
>>>>>>
>>>>>> RDD.take(RDD.count()).last
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 24, 2014 at 10:28 AM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
>>>>>>
>>>>>>> Adnan, but RDD.take(RDD.count()) returns all the elements of the RDD.
>>>>>>>
>>>>>>> I want only to access the last element.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Apr 24, 2014 at 10:33 AM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Oh ya, Thanks Adnan.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Apr 24, 2014 at 10:30 AM, Adnan Yaqoob <nsyaq...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> You can use the following code:
>>>>>>>>>
>>>>>>>>> RDD.take(RDD.count())
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Apr 24, 2014 at 9:51 AM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi All, Some help!
>>>>>>>>>> RDD.first or RDD.take(1) gives the first item. Is there a
>>>>>>>>>> straightforward way to access the last element in a similar way?
>>>>>>>>>>
>>>>>>>>>> I couldn't find a tail/last method for RDD.
>>>>>
>>>>> --
>>>>> Sourav Chandra
>>>>> Senior Software Engineer
>>>>> sourav.chan...@livestream.com
>>>>> o: +91 80 4121 8723
>>>>> m: +91 988 699 3746
>>>>> skype: sourav.chandra
>>>>> Livestream
>>>>> "Ajmera Summit", First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main,
>>>>> 3rd Block, Koramangala Industrial Area,
>>>>> Bangalore 560034
>>>>> www.livestream.com
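
As a footnote to the file-seeking approach posted earlier in this thread, the following is a hedged sketch of the same backwards-seek idea wrapped in a function. It is a variation, not the code that was posted: the names LastLineSketch, lastLine and byteAt are hypothetical, UTF-8 is an assumed encoding, and the handling of a trailing "\r\n" and of single-line or empty files is my own guess at the desired behaviour.

```scala
import java.io.RandomAccessFile
import java.nio.charset.StandardCharsets

// Sketch only: object and method names are made up for illustration.
object LastLineSketch {
  private val LF: Byte = 0x0A
  private val CR: Byte = 0x0D

  // Returns the last line of the file at `path`, or None if the file is empty
  // or holds nothing but one line terminator. One trailing "\n" or "\r\n" is
  // ignored. The last line is assumed to fit in memory (under 2 GB).
  def lastLine(path: String): Option[String] = {
    val file = new RandomAccessFile(path, "r")
    try {
      var end = file.length() // exclusive end offset of the last line
      // Skip one trailing line terminator, if present.
      if (end > 0 && byteAt(file, end - 1) == LF) end -= 1
      if (end > 0 && byteAt(file, end - 1) == CR) end -= 1
      if (end == 0) None
      else {
        // Walk backwards until the previous "\n" or the start of the file.
        var start = end
        while (start > 0 && byteAt(file, start - 1) != LF) start -= 1
        val bytes = new Array[Byte]((end - start).toInt)
        file.seek(start)
        file.readFully(bytes)
        Some(new String(bytes, StandardCharsets.UTF_8))
      }
    } finally file.close()
  }

  // Reads the single byte stored at offset `pos`.
  private def byteAt(file: RandomAccessFile, pos: Long): Byte = {
    file.seek(pos)
    file.readByte()
  }
}
```

Reading one byte per seek keeps memory use constant but is slow for very long last lines; reading backwards in fixed-size blocks would likely be faster.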