[GitHub] metron pull request #622: METRON-1005 Create Decodable Row Key for Profiler
Github user mattf-horton commented on a diff in the pull request: https://github.com/apache/metron/pull/622#discussion_r128650596

--- Diff: metron-analytics/metron-profiler-common/src/main/java/org/apache/metron/profiler/hbase/SaltyRowKeyBuilder.java ---
@@ -81,20 +99,19 @@ public SaltyRowKeyBuilder(int saltDivisor, long duration, TimeUnit units) {
    * @return All of the row keys necessary to retrieve the profile measurements.
    */
   @Override
-  public List rowKeys(String profile, String entity, List groups, long start, long end) {
+  public List encode(String profile, String entity, List groups, long start, long end) {
     // be forgiving of out-of-order start and end times; order is critical to this algorithm
     end = Math.max(start, end);
     start = Math.min(start, end);
--- End diff --

Heh, this has been in the code for a long time, but isn't this a bug? If the values start out in the wrong order, say end is 1 and start is 5, won't this pair of statements leave both end and start equal to the larger value, i.e. 5? We need an intermediate variable for the swap!

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
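The ordering problem raised in the review comment above can be reproduced in a few lines. This is a minimal sketch in Python rather than the project's Java; `buggy_order` and `fixed_order` are hypothetical names used only for illustration, and the fix shown is one possible approach, not necessarily the one the PR adopted.

```python
def buggy_order(start, end):
    # Mirrors the two statements in the diff: end is overwritten first,
    # so the second statement no longer sees the original end value.
    end = max(start, end)
    start = min(start, end)
    return start, end


def fixed_order(start, end):
    # Both results are computed from the original values before
    # either variable is reassigned.
    return min(start, end), max(start, end)


print(buggy_order(5, 1))  # (5, 5): the smaller bound is lost
print(fixed_order(5, 1))  # (1, 5)
print(fixed_order(1, 5))  # (1, 5): already-ordered input is unchanged
```

Computing both bounds from the original values avoids the need for an explicit temporary variable; in Java the same effect needs a local holding the original `end` (or the computed minimum) before either assignment.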
[GitHub] metron issue #577: METRON-746: Build Custom Checkstyle and IDE formatting se...
Github user justinleet commented on the issue: https://github.com/apache/metron/pull/577 As a note, I moved the checkstyle version up to 8 and will update the instructions. @mmiklavc This'll be relevant if you look through anything. I don't know that it'll break on 7.7, because there weren't major changes to the file.
[GitHub] metron pull request #654: METRON-1044: Disabled writers are not acking messa...
Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/654
[GitHub] metron issue #577: METRON-746: Build Custom Checkstyle and IDE formatting se...
Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/577 Hey @justinleet, apologies for this slipping between the cracks. I'll run this again and see if there's anything specific I can find.
[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler
Github user cestella commented on the issue: https://github.com/apache/metron/pull/622 I want to point out that I am also in favor of an audit log for the profiler, but I don't think it's a complete solution for the batch analytics use-case.
[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler
Github user cestella commented on the issue: https://github.com/apache/metron/pull/622 Also, while we're in here, is there a strong reason why the prefixed hash is so large? It's just there for uniformity of distribution, correct? I'd propose a non-cryptographic hash for this purpose like Murmur.
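To make the size argument in the comment above concrete, here is a rough sketch of a 4-byte salt derived with a non-cryptographic hash. It uses a pure-Python rendering of the 32-bit MurmurHash3 variant; the `salt_prefix` helper, its inputs (`period`, `salt_divisor`) and the choice to hash the bucket number are assumptions made for illustration and are not taken from the Metron code.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    # Rotate a 32-bit value left by r bits.
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python sketch of MurmurHash3 (x86, 32-bit variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    body_len = len(data) & ~3  # largest multiple of 4 that fits
    for i in range(0, body_len, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[body_len:]
    if tail:
        # Remaining 1 to 3 bytes are mixed in without the final rotate/add.
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
    # Finalization: fold in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def salt_prefix(period: int, salt_divisor: int) -> bytes:
    # Hypothetical helper: hash the bucket number into a 4-byte prefix.
    bucket = period % salt_divisor
    return murmur3_32(bucket.to_bytes(8, "big")).to_bytes(4, "big")


print(len(salt_prefix(123456, 1000)))  # 4 bytes instead of 16
```

Periods that fall in the same bucket share a prefix, so the salt still only serves to spread rows across regions; the difference is that each key carries 4 bytes of salt rather than 16.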
[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler
Github user cestella commented on the issue: https://github.com/apache/metron/pull/622

So, in my mind the feature here is the enablement of batch analytics on the profiles. To that end, I'm generally in favor of a decodable row key. I think that the question really isn't a ToC *or* a decodable row key. I think, rather, we will want both. The two will follow different access patterns. A decodable row key sans ToC will be suitable only for full table scan-style access. A ToC would let us slice and dice by profile/entity/etc. That being said, a ToC without a decodable row key is substantially less nice: without being able to decode the row key, we will not be able to regenerate the ToC to provide alternative indexing.

I see this as a first step to enable a broader discussion on just what kind of access semantics beyond Get/Put we want to place on the profiles. All that to say, I'm in favor of the effort.

I worry about the impact going forward on existing profiles, though. From the point where we do this, we will create a fork whereby new profiles and old profiles diverge. I think we need to discuss the migration story more explicitly and see if it is plausible to create a migration tool that is fuzzy (i.e. will look at the existing profiles and try to pick them apart). I'd be ok with that work being a follow-on, but I would want the plan to be very explicit and I would be -1 for a release until it's in.
[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler
Github user mattf-horton commented on the issue: https://github.com/apache/metron/pull/622 Let me take a look at this more deeply.
[GitHub] metron issue #577: METRON-746: Build Custom Checkstyle and IDE formatting se...
Github user justinleet commented on the issue: https://github.com/apache/metron/pull/577 @mmiklavc Any response to the above? I'm also happy to take a look through what you set up and see if there's anything missing in the instructions and so on, if that would help.
[GitHub] metron issue #656: METRON-1050 Improve Docs of 'profile.period.duration'
Github user nickwallen commented on the issue: https://github.com/apache/metron/pull/656 Thanks @mattf-horton ! That fixed the links. Everything looks good to me. Please take a look guys.
[GitHub] metron issue #622: METRON-1005 Create Decodable Row Key for Profiler
Github user nickwallen commented on the issue: https://github.com/apache/metron/pull/622

> I don't think our current row key is totally opaque, it just needs a brute-force approach to figure out. Not suitable for interactive queries, but would be acceptable for a one-time pass to build (or re-build) the ToC.

For reference, here is what the existing row key looks like.

    salt (16B) + profile name (?) + entity name (?) + groups (?) + time (8B)

How would you decode it? The salt and the time components have known lengths: 16B and 8B respectively. Other than those two components, I don't know how to distinguish the profile name, entity or groups. I can only decode the row key if I already know either the profile name or the entity, which defeats the purpose of being able to decode it.
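The ambiguity described above can be shown with a small sketch. It follows the layout quoted in the comment (16-byte salt, variable-length names, 8-byte time) but leaves out groups for brevity. The all-zero salt is a placeholder, and the length-prefixed variant is only one hypothetical way to make a key decodable; it is not claimed to be the format the PR uses.

```python
import struct

SALT = b"\x00" * 16  # placeholder 16-byte salt, for illustration only


def concat_key(profile, entity, time):
    # Plain concatenation: nothing marks where profile ends and entity begins.
    return SALT + profile.encode() + entity.encode() + struct.pack(">q", time)


def _field(text):
    # Hypothetical encoding: 2-byte big-endian length, then the bytes.
    raw = text.encode()
    return struct.pack(">H", len(raw)) + raw


def prefixed_key(profile, entity, time):
    return SALT + _field(profile) + _field(entity) + struct.pack(">q", time)


def decode_prefixed(key):
    pos = len(SALT)
    fields = []
    for _ in range(2):  # profile, then entity
        (n,) = struct.unpack_from(">H", key, pos)
        pos += 2
        fields.append(key[pos:pos + n].decode())
        pos += n
    (time,) = struct.unpack_from(">q", key, pos)
    return fields[0], fields[1], time


# Two different (profile, entity) pairs collapse to the same bytes.
print(concat_key("ab", "c", 1) == concat_key("a", "bc", 1))  # True
# With length prefixes the fields can be recovered from the key alone.
print(decode_prefixed(prefixed_key("ab", "c", 1)))  # ('ab', 'c', 1)
```

With plain concatenation only the fixed-width ends of the key can be sliced off reliably, which matches the observation that the salt and time are the only components with known lengths.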
[GitHub] metron issue #614: METRON-992: Create performance tuning guide
Github user cestella commented on the issue: https://github.com/apache/metron/pull/614 +1 by inspection; this is right on the money and a good first pass.
[GitHub] metron issue #654: METRON-1044: Disabled writers are not acking messages
Github user justinleet commented on the issue: https://github.com/apache/metron/pull/654 +1 by inspection, thanks for the test case update.
[GitHub] metron issue #656: METRON-1050 Improve Docs of 'profile.period.duration'
Github user mattf-horton commented on the issue: https://github.com/apache/metron/pull/656

@nickwallen, you found a Github/Doxia difference not previously noticed! Please apply the following patch to site-book/bin/fix-md-dialect.py, and commit it with your docs improvements:

```diff
diff --git a/site-book/bin/fix-md-dialect.py b/site-book/bin/fix-md-dialect.py
index 5e6db3e..02be2fb 100755
--- a/site-book/bin/fix-md-dialect.py
+++ b/site-book/bin/fix-md-dialect.py
@@ -59,7 +59,8 @@ import inspect
 import re
 
 # These are the characters excluded by Markdown from use in auto-generated anchor text for Headings.
-EXCLUDED_CHARS_REGEX = r'[^\w\-]' # all non-alphanumerics except "-" and "_". Whitespace are previously converted.
+EXCLUDED_CHARS_REGEX_GHM = r'[^\w\-]' # all non-alphanumerics except "-" and "_". Whitespace are previously converted.
+EXCLUDED_CHARS_REGEX_DOX = r'[^\w\.\-]' # all non-alphanumerics except "-", "_", and ".". Whitespace are previously converted.
 
 def report_error(s) :
     print >>sys.stderr, "ERROR: " + s
@@ -242,12 +243,12 @@ def rewrite_relative_links() :
 trace('labeltext = "' + labeltext + '"')
 scratch = labeltext.lower() # Github-MD forces all anchors to lowercase
 scratch = re.sub(r'[\s]', "-", scratch) # convert whitespace to "-"
-scratch = re.sub(EXCLUDED_CHARS_REGEX, "", scratch) # strip non-alphanumerics
+scratch = re.sub(EXCLUDED_CHARS_REGEX_GHM, "", scratch) # strip non-alphanumerics
 if (scratch == named_anchor) :
     trace("Found a rewritable case")
     scratch = labeltext # Doxia-markdown doesn't change case
     scratch = re.sub(r'[\s]', "_", scratch) # convert whitespace to "_"
-    scratch = re.sub(EXCLUDED_CHARS_REGEX, "", scratch) # strip non-alphanumerics
+    scratch = re.sub(EXCLUDED_CHARS_REGEX_DOX, "", scratch) # strip non-alphanumerics except "."
     href = re.sub("#" + named_anchor, "#" + scratch, href)
 
 trace("After anchor rewrite, href is: " + href)
@@ -372,9 +373,9 @@ for FILENAME in sys.argv[1:] :
 active_type = "none"
 indent_stack.init_indent()
 if re.search(r'^#[^#]', inputline) :
-    # First-level headers ("H1") need explicit anchor inserted. This fixes problem #6.
+    # First-level headers ("H1") need explicit anchor inserted (Doxia style). This fixes problem #6.
     anchor_name = re.sub(r' ', "_", inputline[1:].strip())
-    anchor_name = re.sub(EXCLUDED_CHARS_REGEX, "", anchor_name)
+    anchor_name = re.sub(EXCLUDED_CHARS_REGEX_DOX, "", anchor_name)
     anchor_text = ''
     if H1_COUNT == 0 :
         # Treat the first header differently - put the header after instead of before
```
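The two anchor rules that the patch above separates can be tried out in isolation. The sketch below reuses the two regular expressions and the whitespace handling from the patch; the wrapper functions `github_anchor` and `doxia_anchor` are hypothetical names added for illustration and do not exist in fix-md-dialect.py. It only mirrors what the patch does and is not a statement of how GitHub or Doxia behave for every heading.

```python
import re

# Copied from the patch: characters stripped from generated anchors.
EXCLUDED_CHARS_REGEX_GHM = r'[^\w\-]'    # keeps alphanumerics, "_" and "-"
EXCLUDED_CHARS_REGEX_DOX = r'[^\w\.\-]'  # additionally keeps "."


def github_anchor(heading):
    # Github-style rule from the patch: lowercase, whitespace to "-".
    scratch = heading.lower()
    scratch = re.sub(r'[\s]', "-", scratch)
    return re.sub(EXCLUDED_CHARS_REGEX_GHM, "", scratch)


def doxia_anchor(heading):
    # Doxia-style rule from the patch: case preserved, whitespace to "_".
    scratch = re.sub(r'[\s]', "_", heading)
    return re.sub(EXCLUDED_CHARS_REGEX_DOX, "", scratch)


for heading in ("profiler.input.topic", "Getting Started"):
    print(heading, "->", github_anchor(heading), "|", doxia_anchor(heading))
# profiler.input.topic -> profilerinputtopic | profiler.input.topic
# Getting Started -> getting-started | Getting_Started
```

The first heading shows the case the patch addresses: the Github-style rule drops the periods while the Doxia-style rule keeps them, so a link written for one renderer needs rewriting for the other.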
[GitHub] metron issue #580: METRON-942 [NO MERGE UNTIL METRON-777] Rest api and confi...
Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/580 I have changed the PR description and steps to reflect that uninstalling the extension no longer stops the Kafka and Storm jobs.
[GitHub] metron issue #481: METRON-322 Global Batching and Flushing
Github user dlyle65535 commented on the issue: https://github.com/apache/metron/pull/481 My test environment is being funky, so I'm +1 by inspection. Thanks for the work, and please don't let me hold you up any more.
[GitHub] metron issue #656: METRON-1050 Improve Docs of 'profile.period.duration'
Github user nickwallen commented on the issue: https://github.com/apache/metron/pull/656

Thanks @mattf-horton. That works for some of the links. It does not seem to work for links with a period in them. Based on Github's rendering, for names with periods I have to drop the periods from the link. Is there a way I can get this to work with both our site book and Github's rendering?

```
[`profiler.input.topic`](#profilerinputtopic)
```
[GitHub] metron pull request #636: METRON-1022: Elasticsearch REST endpoint
Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/636