Re: Access Last Element of RDD

2014-04-24 Thread Sourav Chandra
You can use rdd.takeOrdered(1)(reverseOrdering).

reverseOrdering is your Ordering[T] instance, in which you define the
ordering logic. You have to pass it to the method.
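
A minimal sketch of that suggestion, assuming a Spark dependency on the classpath. The helper name maxByOrdering is hypothetical and not part of Spark. Note that this returns the largest element under the ordering, which is the last element only when the RDD is sorted by that ordering.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical helper: largest element under `ordering`, or None if the
// result is empty. takeOrdered returns the smallest elements first, so the
// ordering is reversed to get the largest one.
def maxByOrdering[T](rdd: RDD[T])(implicit ordering: Ordering[T]): Option[T] =
  rdd.takeOrdered(1)(ordering.reverse).headOption
```

For an RDD[Int], maxByOrdering(rdd) picks up the implicit Ordering[Int]; a custom Ordering[T] can be passed explicitly in the second parameter list.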



On Thu, Apr 24, 2014 at 11:21 AM, Frank Austin Nothaft 
fnoth...@berkeley.edu wrote:

 If you do this, you could simplify to:

 RDD.collect().last

 However, this has the problem of collecting all data to the driver.

 Is your data sorted? If so, you could reverse the sort and take the first.
 Alternatively, a hacky implementation might involve a
 mapPartitionsWithIndex that returns an empty iterator for all partitions
 except for the last. For the last partition, you would filter out all
 elements except for the last element in your iterator. This should leave
 one element, which is your last element.
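
 The mapPartitionsWithIndex idea above could be sketched roughly as follows, assuming a Spark dependency on the classpath. The function name lastOfLastPartition is hypothetical. As written it only looks at the final partition, so it returns None when that partition happens to be empty, even if earlier partitions hold data.

```scala
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// Hypothetical helper: keep only the last element of the last partition.
def lastOfLastPartition[T: ClassTag](rdd: RDD[T]): Option[T] = {
  val lastIndex = rdd.partitions.length - 1
  rdd.mapPartitionsWithIndex { (index, iterator) =>
    if (index == lastIndex && iterator.hasNext)
      // Drain the iterator, remembering only the most recent element.
      Iterator.single(iterator.reduceLeft((_, next) => next))
    else
      Iterator.empty
  }.collect().lastOption
}
```

 Only one element is shipped back to the driver, unlike RDD.collect().last.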

 Frank Austin Nothaft
 fnoth...@berkeley.edu
 fnoth...@eecs.berkeley.edu
 202-340-0466

-- 

Sourav Chandra

Senior Software Engineer

· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · ·

sourav.chan...@livestream.com

o: +91 80 4121 8723

m: +91 988 699 3746

skype: sourav.chandra

Livestream

Ajmera Summit, First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main, 3rd
Block, Koramangala Industrial Area,

Bangalore 560034

www.livestream.com


Re: Access Last Element of RDD

2014-04-24 Thread Sourav Chandra
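The same thing can also be done using rdd.top(1)(ordering). Note that top
takes the forward Ordering[T] rather than the reversed one, because it
already returns the largest elements.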




Re: Access Last Element of RDD

2014-04-24 Thread Sai Prasanna
Thanks, guys!



Re: Access Last Element of RDD

2014-04-24 Thread Sai Prasanna
Hi all, I finally wrote the following code, which I feel performs well even
if it is not the most optimal approach.
It uses a file pointer and seeks backwards from the end of the file to the
byte after the last-but-one \n.
This is memory efficient, and I expect the Unix tail implementation does
something similar.

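import java.io.RandomAccessFile

val FILEPATH = "/home/sparkcluster/hadoop-2.3.0/temp"
val fileHandler = new RandomAccessFile(FILEPATH, "r")
// Index of the final byte; the file is assumed to be non-empty and to end with \n.
val fileLength = fileHandler.length() - 1
var cond = 1                        // set to 0 once a line break is found
var filePointer = fileLength - 1    // start just before the trailing \n
var toRead = -1                     // becomes the byte count of the last line
// Walk backwards until the line break that precedes the last line.
while (filePointer != -1 && cond != 0) {
  fileHandler.seek(filePointer)
  val readByte = fileHandler.readByte()
  if (readByte == 0xA && filePointer != fileLength) cond = 0
  else if (readByte == 0xD && filePointer != fileLength - 1) cond = 0
  filePointer = filePointer - 1; toRead = toRead + 1
}
if (cond == 0) {
  filePointer = filePointer + 2     // stopped on a line break: step past it
} else {
  filePointer = 0                   // reached the start: the file has a single line
  toRead = toRead + 1
}
val bytes = new Array[Byte](toRead)
fileHandler.seek(filePointer)
fileHandler.readFully(bytes)
fileHandler.close()
val bdd = new String(bytes)  /* bdd contains the last line */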





Re: Access Last Element of RDD

2014-04-24 Thread Cheng Lian
You may try this:

val lastOption = sc.textFile(input).mapPartitions { iterator =>
  if (iterator.isEmpty) {
    iterator
  } else {
    // Pair each element with whether more elements follow, and keep only
    // the element for which nothing follows (the partition's last one).
    Iterator
      .continually((iterator.next(), iterator.hasNext))
      .collect { case (value, false) => value }
      .take(1)
  }
}.collect().lastOption

Iterator-based data access ensures O(1) space complexity within each
partition, and it runs faster because different partitions are processed in
parallel. lastOption is used instead of last to deal with an empty file.
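
The per-partition step can also be tried without a cluster. The sketch below lifts it into a plain function and runs it over ordinary Scala iterators that stand in for partitions. The name lastOfPartition and the sample data are assumptions made for illustration only.

```scala
// Hypothetical helper: the per-partition step, usable on any Iterator.
def lastOfPartition[T](iterator: Iterator[T]): Iterator[T] =
  if (iterator.isEmpty) iterator
  else
    Iterator
      .continually((iterator.next(), iterator.hasNext))
      .collect { case (value, false) => value }
      .take(1)

// Stand-in for an RDD's partitions: three local iterators, the last one empty.
val partitions = Seq(Iterator("a", "b"), Iterator("c", "d", "e"), Iterator[String]())

// Each partition contributes at most one element; the final survivor is the
// overall last element.
val lastOption = partitions.flatMap(p => lastOfPartition(p)).lastOption
// lastOption == Some("e")
```

Because empty partitions contribute nothing, a trailing empty partition does not hide the last element of an earlier one.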



Re: Access Last Element of RDD

2014-04-24 Thread Sai Prasanna
Thanks, Cheng!



Access Last Element of RDD

2014-04-23 Thread Sai Prasanna
Hi all, I need some help.
RDD.first or RDD.take(1) gives the first item. Is there a straightforward
way to access the last element in a similar way?

I couldn't find a tail/last method for RDD.


Re: Access Last Element of RDD

2014-04-23 Thread Adnan Yaqoob
You can use the following code:

RDD.take(RDD.count().toInt)



Re: Access Last Element of RDD

2014-04-23 Thread Sai Prasanna
Oh yes, thanks Adnan.



Re: Access Last Element of RDD

2014-04-23 Thread Sai Prasanna
Adnan, but RDD.take(RDD.count().toInt) returns all the elements of the RDD.

I only want to access the last element.



Re: Access Last Element of RDD

2014-04-23 Thread Adnan Yaqoob
This function returns a Scala Array; you can use the Array's last method to
get the last element.

For example:

RDD.take(RDD.count().toInt).last



Re: Access Last Element of RDD

2014-04-23 Thread Sai Prasanna
What I observe is that this way of computing is very inefficient. It returns
all the elements of the RDD to the driver as a local collection, which takes
a considerable amount of time, and only then picks the last element.

I have a 3 GB file on which I ran a lot of aggregate operations, none of
which took as long as this take(RDD.count()) call.

Is there an efficient way? My guess is that there should be one, since it's
a basic operation.

