Hello guys,

I have some transactional data, as in the attached file 1.txt. A transaction here is a sequence of a single operation 1 followed by a few operations 0. I need to find the transactions in which the sum(Amount) of the operation 0 rows is less than the sum(Amount) of operation 1.

I have several questions:

1. What is the most sensible way to deal with this kind of transaction? Does a UDAF help? Or does Spark SQL provide transactional support? I remember that Hive has some kind of support for transactions, like https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-GrammarChanges.

2. The data has been sorted by timestamp. How can I get those transactions within a time period, like 24 hours?

Thank you.
--------------------------------
Thanks & Best regards!
San.Luo
SQL: select * from r where Account=13 limit 20

+-------+---------+----------+-------+
|Account|Operation| Timestamp| Amount|
+-------+---------+----------+-------+
|     13|        1|1400017208| 674.33|
|     13|        0|1400102650|  73.86|
|     13|        1|1400130576|1155.48|
|     13|        1|1400165378|  96.04|
|     13|        0|1400245724| 173.84|
|     13|        0|1400258007| 852.29|
|     13|        1|1400265065|2085.32|
|     13|        0|1400329127|  429.3|
|     13|        0|1400383007|  611.2|
|     13|        1|1400428342|1629.76|
|     13|        0|1400457645| 490.55|
|     13|        1|1400516552| 369.54|
|     13|        1|1400618678|1316.05|
|     13|        0|1400655615| 573.71|
|     13|        0|1400696930| 877.16|
|     13|        0|1400732011| 105.51|
|     13|        0|1400751612|1512.23|
|     13|        0|1400761888| 414.36|
|     13|        0|1400814042|  36.52|
|     13|        0|1400831895| 611.15|
+-------+---------+----------+-------+
only showing top 20 rows
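
To make the transaction definition concrete, here is a minimal sketch in plain Python (not Spark) applied to the first nine sample rows. It rests on assumptions that are not confirmed above: every operation 1 row opens a new transaction, an operation 1 with no following operation 0 counts as a transaction whose operation 0 sum is zero, and operation 0 rows that appear before any operation 1 are ignored. The names `split_transactions` and `flagged` are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# (Account, Operation, Timestamp, Amount): the first nine sample rows.
rows = [
    (13, 1, 1400017208, 674.33),
    (13, 0, 1400102650, 73.86),
    (13, 1, 1400130576, 1155.48),
    (13, 1, 1400165378, 96.04),
    (13, 0, 1400245724, 173.84),
    (13, 0, 1400258007, 852.29),
    (13, 1, 1400265065, 2085.32),
    (13, 0, 1400329127, 429.3),
    (13, 0, 1400383007, 611.2),
]


def split_transactions(rows):
    """Group rows into transactions per account, ordered by timestamp.

    Assumption: each operation 1 row starts a new transaction and the
    operation 0 rows that follow it (until the next operation 1) belong to it.
    """
    transactions = []
    ordered = sorted(rows, key=itemgetter(0, 2))  # by account, then timestamp
    for account, group in groupby(ordered, key=itemgetter(0)):
        current = None  # no open transaction yet for this account
        for _, operation, timestamp, amount in group:
            if operation == 1:
                current = {"account": account, "start": timestamp,
                           "amount_1": amount, "sum_0": 0.0}
                transactions.append(current)
            elif current is not None:
                current["sum_0"] += amount
            # Assumption: operation 0 rows before any operation 1 are skipped.
    return transactions


# Keep the transactions whose operation 0 total is below the operation 1 amount.
flagged = [t for t in split_transactions(rows) if t["sum_0"] < t["amount_1"]]

for t in flagged:
    print(t["account"], t["start"], t["amount_1"], round(t["sum_0"], 2))
```

The running count of operation 1 rows per account acts as a transaction id here. In Spark SQL the same id could probably be expressed as a windowed running sum, for example SUM(Operation) OVER (PARTITION BY Account ORDER BY Timestamp), followed by a group-by on (Account, id); that is a suggestion under the same assumptions, not something tested on the full data.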