Are you looking for something like filter? See a similar example here:
https://spark.apache.org/examples.html
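
For example, a minimal sketch of the filter approach in Scala (the HDFS
paths are placeholders, and I'm assuming the severity tag is the bracketed
prefix of each line):

import org.apache.spark.{SparkConf, SparkContext}

object SplitLogsBySeverity {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SplitLogsBySeverity"))

    // Cache the input so each filter pass doesn't re-read it from HDFS.
    val logs = sc.textFile("hdfs:///logs/input").cache()

    // One filter (and one output directory) per severity level.
    logs.filter(_.startsWith("[ERROR]")).saveAsTextFile("hdfs:///logs/error_file")
    logs.filter(_.startsWith("[INFO]")).saveAsTextFile("hdfs:///logs/info_file")

    sc.stop()
  }
}

Note that each saveAsTextFile call writes a directory of part files rather
than a single file.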

Thanks
Best Regards

On Sat, Jun 13, 2015 at 3:11 PM, Hao Wang <bill...@gmail.com> wrote:

> Hi,
>
> I have a bunch of large log files on Hadoop. Each line contains a log
> message and its severity. Is there a way I can use Spark to split the
> entire data set into different files on Hadoop according to the severity
> field? Thanks.
> Below is an example of the input and output.
>
> Input:
> [ERROR] log1
> [INFO] log2
> [ERROR] log3
> [INFO] log4
>
> Output:
> error_file
> [ERROR] log1
> [ERROR] log3
>
> info_file
> [INFO] log2
> [INFO] log4
>
>
> Best,
> Hao Wang
>
