Re: Generic Dataset[T] Query

2019-05-09 Thread Ramandeep Singh Nanda
You need to supply a RowEncoder. Regards, Ramandeep Singh On Thu, May 9, 2019, 11:33 SNEHASISH DUTTA wrote: > Hi, > > I am trying to write a generic method which will return datasets of a custom type as well as spark.sql.Row > > def read[T](params: Map[String, Any])(implicit encoder:
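
A minimal sketch of what supplying the encoder looks like, assuming Spark 2.x and simplifying the truncated read signature to take a path instead of the params map (the path and schema here are hypothetical):

    import org.apache.spark.sql.{Dataset, Encoder, Row, SparkSession}
    import org.apache.spark.sql.catalyst.encoders.RowEncoder
    import org.apache.spark.sql.types.{LongType, StructField, StructType}

    // Generic reader: the implicit Encoder[T] decides how rows map to T.
    def read[T](path: String)(implicit spark: SparkSession, encoder: Encoder[T]): Dataset[T] =
      spark.read.parquet(path).as[T]

    // For T = Row there is no implicit encoder in scope, so build one
    // from the expected schema (hypothetical single-column schema here):
    val schema = StructType(Seq(StructField("id", LongType)))
    implicit val rowEnc: Encoder[Row] = RowEncoder(schema)
    // val rows: Dataset[Row] = read[Row]("/tmp/input")  // hypothetical path

For a case class T, Spark's built-in product encoders (import spark.implicits._) suffice; Row is the one case where a RowEncoder must be supplied explicitly.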

Re: Re: How to get all input tables of a SPARK SQL 'select' statement

2019-01-25 Thread Ramandeep Singh Nanda
It accepts 'SELECT * FROM FOO', but it doesn't accept 'select * from foo'. > > But I can run spark.sql("select * from foo") in the spark2-shell without any problem. > > Is there another 'layer' in Spark SQL that capitalizes those 'tokens' before invoking the parser?
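
Since the thread's underlying goal is listing a statement's input tables, here is a minimal sketch that uses Spark's own parser (which is case-insensitive for keywords, so lowercase 'select' is fine); it leans on Spark 2.x internals (sessionState, UnresolvedRelation), so treat it as illustrative rather than a stable API:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Parse without executing, then collect every table reference in the plan.
    val plan = spark.sessionState.sqlParser.parsePlan(
      "select * from foo join bar on foo.id = bar.id")
    val tables = plan.collect { case r: UnresolvedRelation => r.tableIdentifier.unquotedString }
    // tables contains "foo" and "bar"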

Re: How to get all input tables of a SPARK SQL 'select' statement

2019-01-23 Thread Ramandeep Singh Nanda
Explain extended or explain would list the plan along with the tables. I am not aware of any statement that explicitly lists dependencies or tables directly. Regards, Ramandeep Singh On Wed, Jan 23, 2019, 11:05 Tomas Bartalos wrote: This might help: > > show tables; > > On Wed, Jan 23, 2019 at 10:43, wrote:
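
To make the EXPLAIN suggestion concrete, a minimal sketch (assuming a temp view named foo; the input tables show up in the analyzed and optimized plans that EXPLAIN EXTENDED prints):

    import spark.implicits._

    // Register a throwaway table, then ask Spark for the full set of plans.
    Seq((1L, "a")).toDF("id", "name").createOrReplaceTempView("foo")
    spark.sql("explain extended select * from foo").show(truncate = false)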

Re: Is it possible to rate limit a UDF?

2019-01-14 Thread Ramandeep Singh Nanda
Basically, it zips two Flowables using the defined function (the function takes two parameters and returns one, hence the name BiFunction). Obviously, you could avoid RxJava entirely by using a TimerTask:

    import java.util.{Timer, TimerTask}

    val a = Seq(1, 2, 3)
    val b = a.zipWithIndex
    b.foreach { case (value, delay) =>
      new Timer().schedule(new TimerTask {
        // completion of the truncated original: emit each value on its tick
        override def run(): Unit = println(value)
      }, delay * 1000L)
    }
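
For the RxJava route itself, a minimal sketch assuming RxJava 2 on the classpath: zipping a timer Flowable against the data Flowable releases at most one element per tick, which is the rate limit:

    import java.util.concurrent.TimeUnit
    import io.reactivex.Flowable
    import io.reactivex.functions.{BiFunction, Consumer}

    // One tick per second; zip pairs each item with a tick.
    val ticks: Flowable[java.lang.Long] = Flowable.interval(1, TimeUnit.SECONDS)
    val items: Flowable[Integer] =
      Flowable.just(Integer.valueOf(1), Integer.valueOf(2), Integer.valueOf(3))

    val limited: Flowable[Integer] = Flowable.zip(ticks, items,
      new BiFunction[java.lang.Long, Integer, Integer] {
        // Takes (tick, item) and returns just the item -- hence "BiFunction".
        override def apply(tick: java.lang.Long, item: Integer): Integer = item
      })

    limited.blockingSubscribe(new Consumer[Integer] {
      override def accept(item: Integer): Unit = println(item)
    })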

Re: Need help with SparkSQL Query

2018-12-17 Thread Ramandeep Singh Nanda
You can use analytic (window) functions in Spark SQL. Something like:

    select * from (
      select *, row_number() over (partition by id order by timestamp) as rn
      from root
    ) where rn = 1

On Mon, Dec 17, 2018 at 4:03 PM Nikhil Goyal wrote: > Hi guys, > > I have a dataframe of type Record (id: Long, timestamp:
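
The same deduplication works through the DataFrame API; a minimal sketch, assuming df holds the Record rows from the question:

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.{col, row_number}

    // Rank rows within each id by timestamp, keep the first, drop the helper.
    val w = Window.partitionBy("id").orderBy("timestamp")
    val firstPerId = df
      .withColumn("rn", row_number().over(w))
      .filter(col("rn") === 1)
      .drop("rn")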

Is DataFrame write blocking?

2018-11-08 Thread Ramandeep Singh Nanda
Hi, I have some futures set up to operate in stages, where I expect one stage to complete before another begins. I was hoping that the DataFrame write call is blocking, whereas the behavior I see is that the call returns before the data is persisted. This can cause unintended consequences. I am also using
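
A minimal sketch of the staged pattern being described, with hypothetical DataFrames (df1, df2) and output paths: because each write is submitted inside a Future, the Future construction returns immediately, so the caller has to await that Future before starting the next stage:

    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration.Duration

    // Stage 1: this line returns immediately; the write runs on the pool.
    val stage1: Future[Unit] = Future {
      df1.write.mode("overwrite").parquet("/tmp/stage1")  // hypothetical path
    }

    // Block until stage 1 has actually persisted before starting stage 2.
    Await.result(stage1, Duration.Inf)

    val stage2: Future[Unit] = Future {
      df2.write.mode("overwrite").parquet("/tmp/stage2")  // hypothetical path
    }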