[ https://issues.apache.org/jira/browse/DRILL-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023986#comment-16023986 ]

ASF GitHub Bot commented on DRILL-5432:
---------------------------------------

Github user tdunning commented on a diff in the pull request:

    https://github.com/apache/drill/pull/831#discussion_r118399563
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/pcap/TestPcapDecoder.java ---
    @@ -0,0 +1,230 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to you under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + * <p>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.store.pcap;
    +
    +import com.google.common.io.Resources;
    +import org.apache.drill.BaseTestQuery;
    +import org.apache.drill.exec.store.pcap.decoder.Packet;
    +import org.apache.drill.exec.store.pcap.decoder.PacketDecoder;
    +import org.junit.BeforeClass;
    +import org.junit.Test;
    +
    +import java.io.BufferedInputStream;
    +import java.io.DataOutputStream;
    +import java.io.File;
    +import java.io.FileInputStream;
    +import java.io.FileOutputStream;
    +import java.io.IOException;
    +import java.io.InputStream;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +
    +public class TestPcapDecoder extends BaseTestQuery {
    +  private static File bigFile;
    +
    +  /**
    +   * Creates an ephemeral file of about a GB in size
    +   *
    +   * @throws IOException If input file can't be read or output can't be written.
    +   */
    +  @BeforeClass
    +  public static void buildBigTcpFile() throws IOException {
    +    bigFile = File.createTempFile("tcp", ".pcap");
    +    bigFile.deleteOnExit();
    +    boolean first = true;
    +    System.out.printf("Building large test file\n");
    +    try (DataOutputStream out = new DataOutputStream(new FileOutputStream(bigFile))) {
    +      for (int i = 0; i < 1000e6 / (29208 - 24) + 1; i++) {
    +        // might be faster to keep this open and rewind each time, but
    +        // that is hard to do with a resource, especially if it comes
    +        // from the class path instead of files.
    +        try (InputStream in = Resources.getResource("store/pcap/tcp-2.pcap").openStream()) {
    +          ConcatPcap.copy(first, in, out);
    +        }
    +        first = false;
    +      }
    +      System.out.printf("Created file is %.1f MB\n", bigFile.length() / 1e6);
    --- End diff --
    
    I changed those methods so that they are called from a public static void
    main(). They can still be used to get information about speeds, but their
    output is no longer included in the test.
    
    I think that addresses this comment.
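
For illustration only, here is a minimal sketch of that pattern. It is not the code in the pull request: the class `PcapSpeedCheck` and its methods are hypothetical names, and an in-memory byte array stands in for a real capture file. The timing and printing live in `main()`, so a test run never executes them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-alone speed check; an assumption, not part of the pull request. */
final class PcapSpeedCheck {

  /** Reads the stream to the end and returns the number of bytes consumed. */
  static long drain(InputStream in) throws IOException {
    byte[] buffer = new byte[64 * 1024];
    long total = 0;
    int n;
    while ((n = in.read(buffer)) != -1) {
      total += n;
    }
    return total;
  }

  /** Converts a byte count and elapsed nanoseconds to MB/s, with 1 MB = 1e6 bytes. */
  static double megabytesPerSecond(long bytes, long elapsedNanos) {
    return (bytes / 1e6) / (elapsedNanos / 1e9);
  }

  /** Run by hand to see throughput; nothing here is invoked by a test runner. */
  public static void main(String[] args) throws IOException {
    // Stand-in data; a real run would open a capture file instead.
    byte[] data = new byte[16 * 1024 * 1024];
    long start = System.nanoTime();
    long bytes;
    try (InputStream in = new ByteArrayInputStream(data)) {
      bytes = drain(in);
    }
    // Guard against a zero elapsed time on very fast runs.
    long elapsed = Math.max(1, System.nanoTime() - start);
    System.out.printf("Read %.1f MB at %.1f MB/s%n",
        bytes / 1e6, megabytesPerSecond(bytes, elapsed));
  }
}
```

Keeping the measurement in `main()` means the numbers are available on demand while the test output stays quiet.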


> Want a memory format for PCAP files
> -----------------------------------
>
>                 Key: DRILL-5432
>                 URL: https://issues.apache.org/jira/browse/DRILL-5432
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Ted Dunning
>
> PCAP files [1] are the de facto standard for storing network capture data. In 
> security and protocol applications, it is very common to want to extract 
> particular packets from a capture for further analysis.
> At a first level, it is desirable to query and filter by source and 
> destination IP and port or by protocol. Beyond that, however, it would be 
> very useful to be able to group packets by TCP session and eventually to look 
> at packet contents. For now, however, the most critical requirement is that 
> we should be able to scan captures at very high speed.
> I previously wrote a (kind of working) proof of concept for a PCAP decoder 
> that did lazy deserialization and could traverse hundreds of MB of PCAP data 
> per second per core. This compares to roughly 2-3 MB/s for widely available 
> Apache-compatible open source PCAP decoders.
> This JIRA covers the integration and extension of that proof of concept as a 
> Drill file format.
> Initial work is available at https://github.com/mapr-demos/drill-pcap-format
> [1] https://en.wikipedia.org/wiki/Pcap
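
The lazy deserialization mentioned in the description can be sketched roughly as follows. This is not the proof-of-concept decoder itself: `LazyPcapRecord` and its methods are hypothetical names, and the sketch assumes a classic little-endian pcap capture (24-byte global header, 16-byte record headers) that is already in memory. A record object holds only a buffer and an offset, and each field is decoded when it is asked for, so a scan that only needs record boundaries never touches packet contents.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical lazy view of one pcap record; field names are assumptions. */
final class LazyPcapRecord {
  static final int GLOBAL_HEADER_SIZE = 24; // classic pcap file header
  static final int RECORD_HEADER_SIZE = 16; // ts_sec, ts_usec, incl_len, orig_len

  private final ByteBuffer buffer;
  private final int offset;

  LazyPcapRecord(ByteBuffer buffer, int offset) {
    this.buffer = buffer;
    this.offset = offset;
  }

  /** Decoded on demand: seconds part of the capture timestamp (unsigned 32 bit). */
  long timestampSeconds() {
    return buffer.getInt(offset) & 0xFFFFFFFFL;
  }

  /** Decoded on demand: number of packet bytes actually stored in the file. */
  int capturedLength() {
    return buffer.getInt(offset + 8);
  }

  /** Offset of the following record; the packet bytes themselves are never read. */
  int nextOffset() {
    return offset + RECORD_HEADER_SIZE + capturedLength();
  }

  /** Counts complete records by hopping from header to header. */
  static int countRecords(byte[] capture) {
    // Assumption: little-endian file; a real decoder would check the magic number.
    ByteBuffer buf = ByteBuffer.wrap(capture).order(ByteOrder.LITTLE_ENDIAN);
    int count = 0;
    int offset = GLOBAL_HEADER_SIZE;
    while (offset + RECORD_HEADER_SIZE <= buf.limit()) {
      LazyPcapRecord record = new LazyPcapRecord(buf, offset);
      int next = record.nextOffset();
      if (record.capturedLength() < 0 || next > buf.limit()) {
        break; // truncated or corrupt record: stop without counting it
      }
      count++;
      offset = next;
    }
    return count;
  }

  /** Builds a tiny synthetic capture with zero-filled packets of the given sizes. */
  static byte[] syntheticCapture(int... payloadLengths) {
    int size = GLOBAL_HEADER_SIZE;
    for (int len : payloadLengths) {
      size += RECORD_HEADER_SIZE + len;
    }
    ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(0xa1b2c3d4);           // pcap magic number
    buf.position(GLOBAL_HEADER_SIZE); // remaining global header fields left as zero
    for (int i = 0; i < payloadLengths.length; i++) {
      buf.putInt(i);                  // ts_sec
      buf.putInt(0);                  // ts_usec
      buf.putInt(payloadLengths[i]);  // incl_len
      buf.putInt(payloadLengths[i]);  // orig_len
      buf.position(buf.position() + payloadLengths[i]);
    }
    return buf.array();
  }

  public static void main(String[] args) {
    byte[] capture = syntheticCapture(4, 6);
    System.out.println("records: " + countRecords(capture));
  }
}
```

Because no per-packet object graph is built and no payload is copied, a scan of this kind is bounded mostly by memory bandwidth, which is the property the description is after.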



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
