[GitHub] metron pull request #622: METRON-1005 Create Decodable Row Key for Profiler

2017-07-20 Thread mattf-horton
Github user mattf-horton commented on a diff in the pull request:

https://github.com/apache/metron/pull/622#discussion_r128650596
  
--- Diff: metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/SaltyRowKeyBuilder.java ---
@@ -81,20 +99,19 @@ public SaltyRowKeyBuilder(int saltDivisor, long duration, TimeUnit units) {
    * @return All of the row keys necessary to retrieve the profile measurements.
    */
   @Override
-  public List rowKeys(String profile, String entity, List groups, long start, long end) {
+  public List encode(String profile, String entity, List groups, long start, long end) {
     // be forgiving of out-of-order start and end times; order is critical to this algorithm
     end = Math.max(start, end);
     start = Math.min(start, end);
--- End diff --

Heh, this has been in the code for a long time, but isn't this a bug?  If 
the arguments start out in the wrong order, say end is 1 and start is 5, won't 
this pair of statements leave both end and start equal to the larger value, 
i.e. 5?  The first statement overwrites end before the second one reads it.  
We need an intermediate variable for a proper swap!
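
A minimal sketch of the problem, written in Python rather than the Java under review so it can be run standalone. `buggy_order` and `fixed_order` are hypothetical names, not Metron code: the first mirrors the two statements in the diff, the second computes both bounds from the original arguments before assigning either.

```python
def buggy_order(start, end):
    # Mirrors the two statements in the diff: `end` is overwritten first,
    # so the min() on the next line no longer sees the original `end`.
    end = max(start, end)
    start = min(start, end)
    return start, end


def fixed_order(start, end):
    # Compute both bounds from the untouched arguments before assigning.
    lo, hi = min(start, end), max(start, end)
    return lo, hi


print(buggy_order(5, 1))   # (5, 5): the lower bound is lost
print(fixed_order(5, 1))   # (1, 5)
print(buggy_order(1, 5))   # (1, 5): in-order input is unaffected
```

In Java the same fix amounts to holding `Math.min(start, end)` in a temporary before reassigning `end`.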


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] metron issue #577: METRON-746: Build Custom Checkstyle and IDE formatting se...

2017-07-20 Thread justinleet
Github user justinleet commented on the issue:

https://github.com/apache/metron/pull/577
  
As a note, I moved the checkstyle version up to 8 and will update the 
instructions.

@mmiklavc This'll be relevant if you look through anything.  I don't know 
that it'll break on 7.7, because there weren't major changes to the file.




[GitHub] metron pull request #654: METRON-1044: Disabled writers are not acking messa...

2017-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/654




[GitHub] metron issue #577: METRON-746: Build Custom Checkstyle and IDE formatting se...

2017-07-20 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/577
  
Hey @justinleet, apologies for this slipping between the cracks. I'll run 
this again and see if there's anything specific I can find.




[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler

2017-07-20 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/622
  
I want to point out that I am also in favor of an audit log for the 
profiler, but I don't think it's a complete solution for the batch analytics 
use-case.




[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler

2017-07-20 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/622
  
Also, while we're in here, is there a strong reason why the prefixed hash 
is so large?  It's just there for uniformity of distribution, correct?  I'd 
propose a non-cryptographic hash for this purpose like Murmur.
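
To make the size difference concrete, here is a rough sketch. It assumes the 16-byte prefix is an MD5-sized digest (the row key layout quoted in this thread lists the salt as 16B) and that the salt is derived from a bucket number; both are assumptions, not a description of the Metron code. Python's standard library has no Murmur implementation, so `zlib.crc32` stands in as the non-cryptographic hash. `wide_salt`, `narrow_salt` and `salt_divisor = 1000` are hypothetical.

```python
import hashlib
import struct
import zlib
from collections import Counter


def wide_salt(bucket):
    # 16-byte prefix from a cryptographic digest (assumed MD5-sized).
    return hashlib.md5(struct.pack(">q", bucket)).digest()


def narrow_salt(bucket):
    # 4-byte prefix from a non-cryptographic hash; CRC32 is a stand-in for Murmur.
    return struct.pack(">I", zlib.crc32(struct.pack(">q", bucket)))


salt_divisor = 1000  # hypothetical number of salt buckets
periods = range(100_000)

# Count how many distinct prefixes each scheme produces over the same periods.
wide = Counter(wide_salt(p % salt_divisor) for p in periods)
narrow = Counter(narrow_salt(p % salt_divisor) for p in periods)

print(len(wide_salt(0)), len(narrow_salt(0)))  # 16 4
print(len(wide), len(narrow))                  # distinct prefixes per scheme
```

Both schemes spread the keys over the same number of prefixes; the shorter hash just spends fewer bytes of every row key doing it.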




[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler

2017-07-20 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/622
  
So, in my mind the feature here is the enablement of batch analytics on the 
profiles.  To that end, I'm in general in favor of a decodable row key. I think 
that the question really isn't a ToC *or* a decodable rowkey.  I think, rather, 
we will want both.  The two will follow different access patterns.  A decodable 
rowkey sans ToC will be suitable only for full table scan-style access.  A ToC 
would let us slice and dice by profile/entity/etc.  

That being said, a ToC without a decodable rowkey is substantially less 
nice.  Without being able to decode the rowkey, we will not be able to 
regenerate the ToC to provide alternative indexing.  I see this as a first step 
to enable a broader discussion on just what kind of access semantics beyond 
Get/Put we want to place on the profiles.

All that to say, I'm in favor of the effort.  I worry about the impact on 
existing profiles going forward, though.  From the point where we do this, we will 
create a fork whereby new profiles and old profiles diverge.  I think we need 
to discuss the migration story more explicitly and see if it is plausible to 
create a migration tool that is fuzzy (i.e. will look at the existing profiles 
and try to pick them apart).

I'd be ok for that work to be a follow-on, but I would want the plan to be 
very explicit and I would be -1 for a release until it's in.




[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler

2017-07-20 Thread mattf-horton
Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622
  
Let me take a look at this more deeply.




[GitHub] metron issue #577: METRON-746: Build Custom Checkstyle and IDE formatting se...

2017-07-20 Thread justinleet
Github user justinleet commented on the issue:

https://github.com/apache/metron/pull/577
  
@mmiklavc Any response to the above? I'm also happy to take a look through 
what you set up and see if there's anything missing in the instructions and so 
on, if that would help.




[GitHub] metron issue #656: METRON-1050 Improve Docs of 'profile.period.duration'

2017-07-20 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/656
  
Thanks @mattf-horton !  That fixed the links.  Everything looks good to me. 
 Please take a look guys.




[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler

2017-07-20 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/622
  
>  I don't think our current row key is totally opaque, it just needs a 
brute-force approach to figure out. Not suitable for interactive queries, but 
would be acceptable for a one-time pass to build (or re-build) the ToC.

For reference, here is what the existing row key looks like.

 salt (16B) + profile name (?) + entity name (?) + groups (?) + time (8B)

How would you decode it?  The salt and the time components have known 
lengths; 16B and 8B respectively.  Other than those two components, I don't 
know how to distinguish the profile name, entity or groups.  I can only decode 
the row key if I already know either the profile name or the entity, which 
defeats the advantages of being able to decode it.
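
The ambiguity can be shown with a small sketch. The salt and time values below are placeholders, groups are omitted for brevity, and the length-prefixed variant is only one hypothetical way to make the key decodable; it is not necessarily what this PR implements. `concat_key`, `prefixed_key` and `decode_prefixed` are made-up names.

```python
import struct

SALT = b"\x00" * 16            # placeholder 16-byte salt
TIME = struct.pack(">q", 42)   # 8-byte period timestamp (arbitrary value)


def concat_key(profile, entity):
    # Layout as described above, with groups omitted: fields simply abut.
    return SALT + profile.encode() + entity.encode() + TIME


def prefixed_key(profile, entity):
    # Hypothetical alternative: a 2-byte length precedes each variable-length field.
    body = b"".join(struct.pack(">H", len(f)) + f
                    for f in (profile.encode(), entity.encode()))
    return SALT + body + TIME


def decode_prefixed(key):
    # Skip the 16-byte salt, stop before the 8-byte time, read length-prefixed fields.
    fields, pos, end = [], 16, len(key) - 8
    while pos < end:
        (n,) = struct.unpack_from(">H", key, pos)
        fields.append(key[pos + 2:pos + 2 + n].decode())
        pos += 2 + n
    return fields


# Two different (profile, entity) pairs collapse to the same bytes.
print(concat_key("ab", "c") == concat_key("a", "bc"))   # True
print(decode_prefixed(prefixed_key("ab", "c")))         # ['ab', 'c']
print(decode_prefixed(prefixed_key("a", "bc")))         # ['a', 'bc']
```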






[GitHub] metron issue #614: METRON-992: Create performance tuning guide

2017-07-20 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/614
  
+1 by inspection; this is right on the money and a good first pass.




[GitHub] metron issue #654: METRON-1044: Disabled writers are not acking messages

2017-07-20 Thread justinleet
Github user justinleet commented on the issue:

https://github.com/apache/metron/pull/654
  
+1 by inspection, thanks for the test case update.




[GitHub] metron issue #656: METRON-1050 Improve Docs of 'profile.period.duration'

2017-07-20 Thread mattf-horton
Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/656
  
@nickwallen , you found a Github/Doxia difference not previously noticed!  
Please apply the following patch to site-book/bin/fix-md-dialect.py, and commit 
it with your docs improvements :
```diff
diff --git a/site-book/bin/fix-md-dialect.py b/site-book/bin/fix-md-dialect.py
index 5e6db3e..02be2fb 100755
--- a/site-book/bin/fix-md-dialect.py
+++ b/site-book/bin/fix-md-dialect.py
@@ -59,7 +59,8 @@ import inspect
 import re
 
 # These are the characters excluded by Markdown from use in auto-generated anchor text for Headings.
-EXCLUDED_CHARS_REGEX = r'[^\w\-]'   # all non-alphanumerics except "-" and "_".  Whitespace are previously converted.
+EXCLUDED_CHARS_REGEX_GHM = r'[^\w\-]'   # all non-alphanumerics except "-" and "_".  Whitespace are previously converted.
+EXCLUDED_CHARS_REGEX_DOX = r'[^\w\.\-]'   # all non-alphanumerics except "-", "_", and ".".  Whitespace are previously converted.
 
 def report_error(s) :
 print >>sys.stderr, "ERROR: " + s 
@@ -242,12 +243,12 @@ def rewrite_relative_links() :
 trace('labeltext = "' + labeltext + '"')
 scratch = labeltext.lower()  # Github-MD forces all anchors to lowercase
 scratch = re.sub(r'[\s]', "-", scratch)  # convert whitespace to "-"
-scratch = re.sub(EXCLUDED_CHARS_REGEX, "", scratch)  # strip non-alphanumerics
+scratch = re.sub(EXCLUDED_CHARS_REGEX_GHM, "", scratch)  # strip non-alphanumerics
 if (scratch == named_anchor) :
 trace("Found a rewritable case")
 scratch = labeltext  # Doxia-markdown doesn't change case
 scratch = re.sub(r'[\s]', "_", scratch)  # convert whitespace to "_"
-scratch = re.sub(EXCLUDED_CHARS_REGEX, "", scratch)  # strip non-alphanumerics
+scratch = re.sub(EXCLUDED_CHARS_REGEX_DOX, "", scratch)  # strip non-alphanumerics except "."
 href = re.sub("#" + named_anchor, "#" + scratch, href)
 
 trace("After anchor rewrite, href is: " + href)
@@ -372,9 +373,9 @@ for FILENAME in sys.argv[1:] :
 active_type = "none"
 indent_stack.init_indent()
 if re.search(r'^#[^#]', inputline) :
-# First-level headers ("H1") need explicit anchor inserted.  This fixes problem #6.
+# First-level headers ("H1") need explicit anchor inserted (Doxia style).  This fixes problem #6.
 anchor_name = re.sub(r' ', "_", inputline[1:].strip())
-anchor_name = re.sub(EXCLUDED_CHARS_REGEX, "", anchor_name)
+anchor_name = re.sub(EXCLUDED_CHARS_REGEX_DOX, "", anchor_name)
 anchor_text = ''
 if H1_COUNT == 0 :
 # Treat the first header differently - put the header after instead of before
```
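
For illustration, the effect of the two regexes on a heading can be sketched as follows. The two constants are copied from the patch; `github_anchor` and `doxia_anchor` are hypothetical helper names that follow the same steps as the patched code (lowercase plus "-" for GitHub, original case plus "_" for Doxia). This is a standalone Python 3 sketch, not part of fix-md-dialect.py.

```python
import re

EXCLUDED_CHARS_REGEX_GHM = r'[^\w\-]'    # GitHub: keep word chars and "-"
EXCLUDED_CHARS_REGEX_DOX = r'[^\w\.\-]'  # Doxia: additionally keep "."


def github_anchor(heading):
    # GitHub lowercases, turns whitespace into "-", then strips excluded chars.
    scratch = heading.lower()
    scratch = re.sub(r'[\s]', "-", scratch)
    return re.sub(EXCLUDED_CHARS_REGEX_GHM, "", scratch)


def doxia_anchor(heading):
    # Doxia keeps case, turns whitespace into "_", and leaves "." in place.
    scratch = re.sub(r'[\s]', "_", heading)
    return re.sub(EXCLUDED_CHARS_REGEX_DOX, "", scratch)


print(github_anchor("profiler.input.topic"))   # profilerinputtopic
print(doxia_anchor("profiler.input.topic"))    # profiler.input.topic
print(github_anchor("Creating Profiles"))      # creating-profiles
print(doxia_anchor("Creating Profiles"))       # Creating_Profiles
```

So a link written for GitHub as `#profilerinputtopic` is what the script needs to rewrite to `#profiler.input.topic` for the site book.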




[GitHub] metron issue #580: METRON-942 [NO MERGE UNTIL METRON-777] Rest api and confi...

2017-07-20 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/580
  
I have changed the PR description and steps to reflect that uninstalling 
the extension no longer stops the Kafka and Storm jobs.




[GitHub] metron issue #481: METRON-322 Global Batching and Flushing

2017-07-20 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/metron/pull/481
  
My test environment is being funky, so I'm +1 by inspection.  Thanks for the 
work; please don't let me hold you up any more.




[GitHub] metron issue #656: METRON-1050 Improve Docs of 'profile.period.duration'

2017-07-20 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/656
  
Thanks @mattf-horton .  That works for some of the links.  It does not seem 
to work for links with a period in them.

Based on Github's rendering, for names with periods I have to drop periods 
from the link.  Is there a way I can get this to work with both our site book 
and Github's rendering?
```
[`profiler.input.topic`](#profilerinputtopic)
```




[GitHub] metron pull request #636: METRON-1022: Elasticsearch REST endpoint

2017-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/636

