Hi all,

We are new to Spark and related technologies like Hadoop and HDFS. The Spark documentation suggests that it could be an interesting tool for both real-time and batch video analytics. In our case, analytics means:
- detecting features inside a video stream (objects, faces, ...)
- extracting features corresponding to given metadata (e.g. third-party object bounding boxes)
- tracking these features across video frames
- learning from these features (clustering, classification, ...)

There is now a lot of work that uses the map-reduce pattern to run these computations over large video datasets, and we would like to use Spark for this because of its promising performance. The difficult part is how to get started:
- It is not clear how we can transform a live video stream (H.264 encoded, RTSP protocol) and video files into an RDD that Spark can consume.
- How should we store the video files for batch processing? Currently these files live on a streaming server.
- Is it easy to extend Spark Streaming's sliding window to 3D (space + time): a 2D sliding window inside each frame for feature detection, plus the usual sliding window over time?
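To make the last question concrete, here is a small pure-Python sketch (no Spark involved; all function names and parameters are illustrative, not any existing API) of what we mean by a 3D sliding window: a 2D window slid over each frame, combined with a window over time. We imagine the time dimension would map onto Spark Streaming's windowing, while the spatial part would be a flatMap over each frame.

```python
def patches_2d(frame, win, stride):
    """Slide a win x win spatial window over one 2D frame (list of rows)."""
    h, w = len(frame), len(frame[0])
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            yield (y, x, [row[x:x + win] for row in frame[y:y + win]])

def sliding_windows_3d(frames, t_win, t_stride, s_win, s_stride):
    """Combine a temporal window over frames with a spatial window per frame.

    Yields (t, y, x, block) where block is a t_win x s_win x s_win volume:
    the same spatial patch taken from each frame in the time window.
    """
    frames = list(frames)
    for t in range(0, len(frames) - t_win + 1, t_stride):
        window = frames[t:t + t_win]
        for (y, x, _) in patches_2d(window[0], s_win, s_stride):
            block = [[row[x:x + s_win] for row in f[y:y + s_win]]
                     for f in window]
            yield (t, y, x, block)

if __name__ == "__main__":
    # four tiny 4x4 "frames" with distinct values per (t, row, col)
    frames = [[[t * 100 + r * 10 + c for c in range(4)] for r in range(4)]
              for t in range(4)]
    blocks = list(sliding_windows_3d(frames, t_win=2, t_stride=1,
                                     s_win=2, s_stride=2))
    print(len(blocks))  # 3 time windows x 4 spatial positions = 12 blocks
```

In other words, each yielded block is a little space-time volume over which feature detection or tracking could run; what we are unsure of is whether this spatial part belongs inside a map over frame records or needs a custom windowing mechanism.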
Any tips and suggestions would be helpful.

Best regards,
Jaonary