Repository: samza
Updated Branches:
  refs/heads/master 988260a20 -> 7d3eb08b3


Fix case-studies for LinkedIn, Optimizely, Tripadvisor, Slack. Re-word some of 
them.

Author: Jagadish <jvenkatra...@linkedin.com>

Reviewers: Jagadish <jagad...@apache.org>

Closes #729 from vjagadish1989/website-reorg18


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/7d3eb08b
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/7d3eb08b
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/7d3eb08b

Branch: refs/heads/master
Commit: 7d3eb08b3e0ed025e15a7f70c50ff84e803e39d7
Parents: 988260a
Author: Jagadish <jvenkatra...@linkedin.com>
Authored: Mon Oct 15 16:46:33 2018 -0700
Committer: Jagadish <jvenkatra...@linkedin.com>
Committed: Mon Oct 15 16:46:33 2018 -0700

----------------------------------------------------------------------
 docs/_case-studies/ebay.md                      |  2 +-
 docs/_case-studies/linkedin.md                  | 23 +++++----
 docs/_case-studies/optimizely.md                | 53 ++++++++++----------
 docs/_case-studies/redfin.md                    | 40 +++++++--------
 docs/_case-studies/slack.md                     | 23 ++++-----
 docs/_case-studies/tripadvisor.md               | 29 +++++------
 docs/_powered-by/linkedin.md                    |  4 ++
 .../versioned/core-concepts/core-concepts.md    |  4 +-
 8 files changed, 90 insertions(+), 88 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/_case-studies/ebay.md
----------------------------------------------------------------------
diff --git a/docs/_case-studies/ebay.md b/docs/_case-studies/ebay.md
index 96821f0..7156ce5 100644
--- a/docs/_case-studies/ebay.md
+++ b/docs/_case-studies/ebay.md
@@ -65,5 +65,5 @@ Key Samza features: *Stateful processing*, *Windowing*, 
*Kafka-integration*, *JM
 
 More information:
 
--   
[https://www.slideshare.net/edibice/extremely-low-latency-web-scale-fraud-prevention-with-apache-samza-kafka-and-friends](https://www.slideshare.net/edibice/extremely-low-latency-web-scale-fraud-prevention-with-apache-samza-kafka-and-friends)
+-   [Slides: Low-latency fraud prevention with Apache Samza](https://www.slideshare.net/edibice/extremely-low-latency-web-scale-fraud-prevention-with-apache-samza-kafka-and-friends)
 -   [http://ebayenterprise.com/](http://ebayenterprise.com/)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/_case-studies/linkedin.md
----------------------------------------------------------------------
diff --git a/docs/_case-studies/linkedin.md b/docs/_case-studies/linkedin.md
index df62764..66e4108 100644
--- a/docs/_case-studies/linkedin.md
+++ b/docs/_case-studies/linkedin.md
@@ -25,25 +25,28 @@ excerpt_separator: <!--more-->
 -->
 
 How LinkedIn built Air Traffic Controller, a stateful stream processing system 
to optimize email communications?
+
 <!--more-->
 
-LinkedIn is a professional networking company that offers various services and 
platform for job seekers, employers and sales professionals. With a growing 
user base and multiple product offerings, it becomes imperative to streamline 
and standardize our communications to the users. In order to ensure member 
experience comes first before individual product metrics, LinkedIn developed a 
new email and notifications platform called *Air Traffic Controller*.
+LinkedIn is a professional networking company that offers various services and 
platform for job seekers, employers and sales professionals. With a growing 
user base and multiple product offerings, it becomes imperative to streamline 
communications to members. To ensure member experience comes first, LinkedIn 
developed a new email and notifications platform called *Air Traffic Controller 
(ATC)*.
 
-ATC is an intelligent platform, that is capable of tracking all the outgoing 
communications to the user and delivering the communication through the right 
channel to the right member at the right time.
+ATC is designed to be an intelligent platform that tracks all outgoing communications and delivers each one through the right channel to the right member at the right time.
 
 <img src="/img/{{site.version}}/case-studies/linkedin-atc-samza-pipeline.png" 
alt="architecture" style="max-width: 80%; height: auto;" 
onclick="window.open(this.src)"/>
 
-It has a three main components,
+Any service that wants to send out a notification to members writes its request to a Kafka topic, which ATC later reads from. The ATC platform comprises three components: <br/>
+
+_Partitioners_ read incoming communication requests from Kafka and distribute them across _Pipeline_ instances based on a hash of the recipient. They also do some filtering early on to drop malformed messages. <br/><br/>
+The _Relevance processors_ read personalized machine-learning models from Kafka and store them in Samza's state store for later evaluation. They use the models to score incoming requests and determine the right channel for the notification (e.g., drop it, send an email, or send a push notification). <br/><br/>
+The _ATC pipeline_ processors aggregate the output from the _Relevance processors_ and the _Partitioners_, thereby making the final call on the notification. They leverage Samza's local state heavily to batch and aggregate notifications, and decide the frequency of delivery: duplicate notifications are merged, and notifications are capped at a certain threshold. The _Pipeline_ also implements a _scheduler_ on top of Samza's local store so that it can schedule messages for later delivery; as an example, it may not be helpful to send a push notification at midnight. <br/><br/>
 
-- **Partitioner**: Partition communication requests, metrics based on user
-- **Pipeline**: Handle partitioned communication requests which performs 
aggregation and consults with the relevance model to determine delivery time
-- **Relevance processor**: Provide insights on how relevant is the content to 
the user, the right delivery time, etc.
 
-ATC, leverages Samza extensively and uses a lot of features including but not 
limited to:
+ATC uses several of Samza's features:
 
-- **Stateful processing**: The ML models in the relevance module are stored 
locally in RocksDb which are updated realtime time based on user feedback.
-- **Async APIs and Multi-threading**: Samza’s multi-threading and Async APIs 
allows ATC to perform remote calls with high-throughput. This helps bring down 
the 90th percentile (P90) end-to-end latency for end to end latency for push 
notifications from about 12 seconds to about 1.5 seconds.
-- **Host affinity**: Co-location of local state stores along with host 
awareness helps ATC to achieve zero downtime and instant recovery.
+**1. Stateful processing**: The ML models in the relevance module are stored locally in RocksDB and are updated in real time based on user feedback. <br/><br/>
+**2. Async APIs and Multi-threading**: Samza’s multi-threading and Async APIs allow ATC to perform remote calls with high throughput. This helps bring down the 90th percentile end-to-end latency for push notifications. <br/><br/>
+**3. Host affinity**: Samza's incremental checkpointing and host-affinity enable ATC to achieve zero downtime during upgrades and instant recovery during failures. <br/><br/>
 
 Key Samza Features: *Stateful processing*, *Async API*, *Host affinity*
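The recipient-based partitioning described above can be sketched in a few lines. This is an illustrative Python sketch, not ATC code; the field names (`recipient`, `payload`), the use of MD5, and the pipeline count are assumptions:

```python
import hashlib

def partition_for(recipient_id: str, num_pipelines: int) -> int:
    """Pick a pipeline instance for a recipient.

    Hashing on the recipient (rather than, say, round-robin) guarantees that
    all notifications for one member land on the same pipeline instance,
    which is what makes local-state aggregation possible downstream.
    """
    digest = hashlib.md5(recipient_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_pipelines

def route(requests, num_pipelines=4):
    """Group well-formed requests by target pipeline, dropping malformed ones early."""
    buckets = {i: [] for i in range(num_pipelines)}
    for req in requests:
        if "recipient" not in req or "payload" not in req:
            continue  # early filtering of malformed messages
        buckets[partition_for(req["recipient"], num_pipelines)].append(req)
    return buckets
```

Because the hash is deterministic, a member's requests always land in the same bucket, so the downstream aggregation never has to coordinate across instances.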
 

http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/_case-studies/optimizely.md
----------------------------------------------------------------------
diff --git a/docs/_case-studies/optimizely.md b/docs/_case-studies/optimizely.md
index 5df32c4..c93c4b2 100644
--- a/docs/_case-studies/optimizely.md
+++ b/docs/_case-studies/optimizely.md
@@ -5,7 +5,7 @@ title: Real Time Session Aggregation
 study_domain: optimizely.com
 priority: 2
 menu_title: Optimizely
-exclude_from_loop: true
+exclude_from_loop: false
 excerpt_separator: <!--more-->
 ---
 <!--
@@ -29,29 +29,31 @@ Real Time Session Aggregation
 
 <!--more-->
 
-Optimizely is a world’s leading experimentation platform, enabling 
businesses to 
+Optimizely is the world’s leading experimentation platform, enabling 
businesses to 
 deliver continuous experimentation and personalization across websites, mobile 
 apps and connected devices. At Optimizely, billions of events are tracked on a 
-daily basis. Session metrics are among the key metrics provided to their end 
user 
-in real time. Prior to introducing Samza for their realtime computation, the 
+daily basis and session metrics are provided to their users in real-time. 
+
+Prior to introducing Samza for their realtime computation, the 
 engineering team at Optimizely built their data-pipeline using a complex 
-[Lambda architecture] (http://lambda-architecture.net/) leveraging 
-[Druid and Hbase] 
(https://medium.com/engineers-optimizely/building-a-scalable-data-pipeline-bfe3f531eb38).
 
-As business requirements evolve, this solution became more and more 
challenging.
+[Lambda architecture](http://lambda-architecture.net/) using 
+[Druid and HBase](https://medium.com/engineers-optimizely/building-a-scalable-data-pipeline-bfe3f531eb38).
 
+Since some session metrics were computed using Map-Reduce jobs, they could be delayed by up to several hours after the events were received. As business requirements evolved, this solution became more and [more challenging](https://medium.com/engineers-optimizely/from-batching-to-streaming-real-time-session-metrics-using-samza-part-1-aed2051dd7a3) to scale.
+
 
-The engineering team at Optimizely decided to move away from Druid and focus 
on 
-HBase as the store, and introduced stream processing to pre-aggregate and 
-deduplicate session events. In their solution, every session event is tagged 
-with an identifier for up to 30 minutes; upon receiving a session event, the 
-Samza job updates session metadata and aggregates counters for the session 
-that is stored in a local RocksDB state store. At the end of each one-minute 
-window, aggregated session metrics are ingested to HBase. With the new solution
+The engineering team at Optimizely turned to stream processing to reduce 
latencies. 
+In their solution, each upstream client associates a _sessionId_ with the events it generates. Upon receiving each event, the Samza job extracts various fields (e.g., IP address, location, browser version) and updates aggregated metrics for the session. At the end of a time window, the merged metrics for that session are ingested into HBase.
 
--   The median query latency was reduced from 40+ ms to 5 ms
--   Session metrics are now available in realtime
--   HBase query response time is improved due to reduced write-rate
--   HBase storage requirement are drastically reduced
--   Lower development effort thanks to out-of-the-box Kafka integration
+With the new solution: <br/>
+-   The median query latency was reduced from 40+ ms to 5 ms <br/>
+-   Session metrics are now available in real-time <br/>
+-   Write-rate to HBase is reduced, since the metrics are pre-aggregated by Samza <br/>
+-   Storage requirements on HBase are drastically reduced <br/>
+-   Lower development effort thanks to out-of-the-box Kafka integration <br/>
  
 Here is a testimonial from Optimizely
 
@@ -61,17 +63,16 @@ for analysis. Apache Samza has been a great asset to 
Optimizely's Event
 ingestion pipeline allowing us to perform large scale, real time stream 
 computing such as aggregations (e.g. session computations) and data enrichment 
 on a multiple billion events / day scale. The programming model, durability 
-and the close integration with Apache Kafka fit our needs perfectly” said 
-Vignesh Sukumar, Senior Engineering Manager at Optimizely”
+and the close integration with Apache Kafka fit our needs perfectly,” says
+Vignesh Sukumar, Senior Engineering Manager at Optimizely.
 
-In addition, stream processing is also applied to other use cases such as 
-data enrichment, event stream partitioning and metrics processing at 
Optimizely.
+In addition to this case study, Apache Samza is also leveraged for other use cases such as data enrichment, re-partitioning of event streams and computing real-time metrics.
 
 Key Samza features: *Stateful processing*, *Windowing*, *Kafka-integration*
 
 More information
 
--   
[https://medium.com/engineers-optimizely/from-batching-to-streaming-real-time-session-metrics-using-samza-part-1-aed2051dd7a3](https://medium.com/engineers-optimizely/from-batching-to-streaming-real-time-session-metrics-using-samza-part-1-aed2051dd7a3)
-c9715fbc85f973907807cccc26c9d7d3ed983df
--   
[https://medium.com/engineers-optimizely/from-batching-to-streaming-real-time-session-metrics-using-samza-part-2-b596350a7820](https://medium.com/engineers-optimizely/from-batching-to-streaming-real-time-session-metrics-using-samza-part-2-b596350a7820)
+-   [From batching to streaming at Optimizely - Part 
1](https://medium.com/engineers-optimizely/from-batching-to-streaming-real-time-session-metrics-using-samza-part-1-aed2051dd7a3)
+-   [From batching to streaming at Optimizely - Part 
2](https://medium.com/engineers-optimizely/from-batching-to-streaming-real-time-session-metrics-using-samza-part-2-b596350a7820)
     
\ No newline at end of file
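The windowed session aggregation described above can be sketched as follows. This is a minimal Python sketch under assumed event fields (`sessionId`, `ip`, `location`, `browser`); a plain dict stands in for Samza's RocksDB state store and `sink` stands in for HBase:

```python
from collections import defaultdict

class SessionAggregator:
    """Per-window session aggregation, in the spirit of the pipeline above.

    Events for a session are merged into a local dict; at the end of each
    window the merged metrics are flushed to the sink and local state is
    cleared for the next window.
    """
    def __init__(self, sink):
        self.sessions = defaultdict(lambda: {"events": 0, "fields": {}})
        self.sink = sink

    def on_event(self, event):
        s = self.sessions[event["sessionId"]]
        s["events"] += 1
        # extract interesting fields, keeping the latest value seen
        for k in ("ip", "location", "browser"):
            if k in event:
                s["fields"][k] = event[k]

    def on_window_end(self):
        # flush pre-aggregated metrics once per window: one write per
        # session instead of one write per event
        for session_id, metrics in self.sessions.items():
            self.sink.write(session_id, metrics)
        self.sessions.clear()
```

Flushing once per window, rather than per event, is what reduces the write-rate and storage pressure on the downstream store.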

http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/_case-studies/redfin.md
----------------------------------------------------------------------
diff --git a/docs/_case-studies/redfin.md b/docs/_case-studies/redfin.md
index 4341f18..d03be86 100644
--- a/docs/_case-studies/redfin.md
+++ b/docs/_case-studies/redfin.md
@@ -35,34 +35,34 @@ emails, scheduled digests and push notifications. Thousands 
of emails are delive
 to customers every minute at peak. 
 
 The notification system used to be a monolithic system, which served the 
company 
-well. However, as business grew and requirements evolved, it became harder and 
+well. However, as the business grew and requirements evolved, it became harder 
and 
 harder to maintain and scale. 
 
 ![Samza pipeline at Redfin](/img/case-studies/redfin.svg)
 
-The engineering team at Redfin decided to replace 
-the existing system with Samza primarily for Samza’s performance, 
scalability, 
-support for stateful processing and Kafka-integration. A multi-stage stream 
-processing pipeline was developed. At the Identify stage, external events 
-such as new Listings are identified as candidates for new notification;
-then potential recipients of notifications are determined by analyzing data in 
-events and customer profiles, results are grouped by customer at the end of 
-each time window at the Match Stage; once recipients and notification outlines 
are 
-identified, the Organize stage retrieves adjunct data necessary to appear in 
each 
-notification from various data sources by joining them with notification and 
-customer profiles, results are stored/merged in local RocksDB state store; 
finally 
-notifications are formatted at the Format stage and sent to notification
- delivery system at the Notify stage. 
+The engineering team at Redfin decided to replace the existing system with Samza, primarily for Samza’s performance, scalability, support for stateful processing and Kafka-integration. A multi-stage stream processing pipeline was developed. At the _Identify_ stage, external events such as new listings are identified as candidates for sending a new notification;
+then potential recipients of notifications are determined by analyzing data in the events and customer profiles. The results are grouped by customer at the end of each time window during the _Match_ stage.
+Once notifications and recipients are identified, the _Organize_ stage further joins them with additional data sources (e.g., notification settings, customer profiles) leveraging Samza's support for local state. It makes heavy use of RocksDB to store and merge individual notifications before sending them to customers.
+Finally, the notifications are formatted at the _Format_ stage and sent to the delivery system at the _Notify_ stage.
 
-With the new notification system
+With the new notification system based on Apache Samza, Redfin observed that
 
--   The system is more performant and horizontally scalable
 -   It is now easier to add support for new use cases
--   Reduced pressure on other system due to the use of local RocksDB state 
store
--   Processing stages can be scaled individually
+-   The new system is more performant and horizontally scalable
+-   Reduced pressure on downstream services due to the use of local RocksDB 
state store
+-   Processing stages can be scaled individually since they are isolated
 
-Other engineering teams at Redfin are also using Samza for business metrics 
-calculation, document processing, event scheduling.
+In addition to the notifications platform, other engineering teams at Redfin also use Samza for calculating business metrics, document processing and event scheduling.
 
 Key Samza Features: *Stateful processing*, *Windowing*, *Kafka-integration*
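The local-state join performed by the _Organize_ stage can be sketched as follows. This is a toy Python version, not Redfin's code; a dict stands in for RocksDB, and the field names are hypothetical:

```python
class OrganizeStage:
    """Joins notification candidates with locally cached profile data.

    Profile updates are applied to a local dict (standing in for a RocksDB
    state store), so each incoming notification is enriched by a local
    lookup instead of a remote call, and notifications for the same
    customer are merged before delivery.
    """
    def __init__(self):
        self.profiles = {}   # customer_id -> profile (the "table" side)
        self.pending = {}    # customer_id -> merged notification

    def on_profile(self, customer_id, profile):
        self.profiles[customer_id] = profile

    def on_notification(self, customer_id, listing):
        profile = self.profiles.get(customer_id, {})
        merged = self.pending.setdefault(
            customer_id, {"listings": [], "email": profile.get("email")})
        merged["listings"].append(listing)  # merge notifications per customer

    def flush(self):
        """Hand the merged notifications to the next stage and reset state."""
        out, self.pending = self.pending, {}
        return out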
 

http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/_case-studies/slack.md
----------------------------------------------------------------------
diff --git a/docs/_case-studies/slack.md b/docs/_case-studies/slack.md
index 6bfc023..bc5b5fd 100644
--- a/docs/_case-studies/slack.md
+++ b/docs/_case-studies/slack.md
@@ -28,28 +28,25 @@ How Slack monitors their infrastructure using Samza's 
streaming data-pipelines?
 
 <!--more-->
 
-Slack is a cloud based company that offers collaboration tools and services to 
increase productivity. With a rapidly growing user base, and a daily active 
users north of 8 million, there is an imminent need to react quickly to issues 
and proactively monitor the health of the application. With a lack of existing 
monitoring solution, the team went on to build a new data pipeline with the 
following requirements
+Slack is a cloud-based company that offers collaboration tools and services to increase productivity. With a rapidly growing user base and daily active users north of 8 million, they needed to react quickly to issues and proactively monitor application health. For this, the team went on to build a new monitoring solution using Apache Samza with the following requirements:
 
-- Near realtime alerting
-- Fault tolerant and high throughput data pipeline
-- Process billions of metric data, logs and derive timely insights on the 
health of application
-- Extend the pipeline to other use cases such as experimentation, performance 
etc.
+- Near real-time alerting to quickly surface issues
+- Fault-tolerant processing of data streams
+- Process billions of events from metrics, logs and derive timely insights on 
application health
+- Ease of extensibility to other use cases like experimentation
 
 <img src="/img/{{site.version}}/case-studies/slack-samza-pipeline.png" 
alt="architecture" style="max-width: 80%; height: auto;" 
onclick="window.open(this.src)"/>
 
-The engineering team built a data platform using Apache Samza. It has three 
main components,
+The engineering team at Slack built their data platform using Apache Samza. It 
has three types of Samza jobs - _Routers_, _Processors_ and _Converters_.
 
-- **Router**: Deserialize Kafka events and add instrumentation
-- **Processor**: Registers with the routers to process subset of message types 
and performs aggregation
-- **Converter**: Enrich the processed data before piping the data to analytics 
store.  
+All services at Slack emit their logs in a well-defined format, which end up in a Kafka cluster. The logs are processed by a fleet of Samza jobs called _Routers_. The routers deserialize incoming log events, decorate them and add instrumentation on top of them.
+The output of the routers is processed by another pipeline, the _Processors_, which perform aggregations using Samza's state-store. Finally, the processed results are enriched by the last stage, the _Converters_, which pipe the data into Druid for analytics and querying. Performance anomalies trigger an alert to a slackbot for further action. Slack built the data-platform to be extensible, thereby enabling other teams within the company to build their own applications on top of it.
 
-The clients and backend servers channels the logs and exceptions through Kafka 
to content routers a.k.a samza partitioners. The partitioned data then flows 
through processors where it is stored in RocksDb before being joined with other 
metrics data. The enriched data is stored in druid which powers analytics 
queries and also acts as a trigger to alert slackbot notifications.
-
-Other notable use case includes experimentation framework that leverage the 
data pipeline to track the results of A/B testing in near realtime. The metrics 
data is joined with the exposure table (members part of the experiment) to 
derive insights on the experiment. The periodic snapshots of RocksDb is also 
used to perform data quality check with the batch pipeline.
+Another noteworthy use case powered by Samza is their experimentation framework. It leverages a data pipeline to measure the results of A/B testing in near real-time. The pipeline uses Samza to join a stream of performance-related metrics with additional data on the experiments that each customer was a part of. This enables Slack to learn how each experiment affects their overall customer experience.
 
 Key Samza Features: *Stateful processing*, *Join*, *Windowing*
 
 More information
 
 - [Talk: Streaming data pipelines at 
Slack](https://www.youtube.com/watch?v=wbS1P9ehgd0)
-- [Slides: Streaming data pipelines at 
Slack](https://speakerdeck.com/vananth22/streaming-data-pipelines-at-slack)
+- [Slides: Streaming data pipelines at 
Slack](https://speakerdeck.com/vananth22/streaming-data-pipelines-at-slack)
\ No newline at end of file
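The aggregation-and-alerting step performed by the _Processors_ can be sketched as follows. This is an illustrative Python sketch, not Slack's code; the event fields (`service`, `metric`) and the alert threshold are assumptions:

```python
def process_logs(log_events, threshold=100):
    """Aggregate log events per (service, metric) key and flag anomalies.

    A condensed stand-in for the Processor stage: counts accumulate in a
    local dict (Samza would keep them in its state store), and any key
    whose count crosses `threshold` within the batch is returned as an
    alert candidate for the slackbot.
    """
    counts = {}
    for event in log_events:
        key = (event["service"], event["metric"])
        counts[key] = counts.get(key, 0) + 1
    alerts = [key for key, n in counts.items() if n > threshold]
    return counts, alerts
```

In the real pipeline the counts would be windowed and checkpointed; here a single batch keeps the shape of the computation visible.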

http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/_case-studies/tripadvisor.md
----------------------------------------------------------------------
diff --git a/docs/_case-studies/tripadvisor.md 
b/docs/_case-studies/tripadvisor.md
index 80ee33b..b3887c6 100644
--- a/docs/_case-studies/tripadvisor.md
+++ b/docs/_case-studies/tripadvisor.md
@@ -27,21 +27,20 @@ Hedwig - Converting Hadoop M/R ETL systems to Stream 
Processing at TripAdvisor
 
 <!--more-->
 
-TripAdvisor is one of the world’s largest travel website that provides hotel 
+TripAdvisor is one of the world’s largest travel websites that provides 
hotel 
 and restaurant reviews, accommodation bookings and other travel-related 
-content. It produces and processes billions events processed everyday 
+content. It produces and processes billions of events every day 
 including billing records, reports, monitoring events and application 
 notifications.
 
-Prior to migrating to Samza, TripAdvisor used Hadoop to ETL its data. Raw 
-data was rolled up to hourly and daily in a number of stages with joins 
-and sliding windows applied, session data is then produced from daily data. 
-About 300 million sessions are produced daily. With this solution, the 
+Prior to migrating to Samza, TripAdvisor used Hadoop to ETL its data. In this 
model, raw 
+data was rolled up to hourly and daily snapshots in a number of stages with 
joins 
+and sliding windows applied. Session data was then extracted from the daily 
snapshots. 
+About 300 million sessions were produced daily. With this solution, the 
 engineering team were faced with a few challenges
   
--   Long lag time to downstream that is business critical
+-   Long lag time to produce business-critical metrics
 -   Difficult to debug and troubleshoot due to scripts, environments, etc.
--   Adding more nodes doesn’t help to scale
  
 The engineering team at TripAdvisor decided to replace the Hadoop solution 
 with a multi-stage Samza pipeline. 
@@ -49,11 +48,11 @@ with a multi-stage Samza pipeline.
 ![Samza pipeline at TripAdvisor](/img/case-studies/trip-advisor.svg)
 
 In the new solution, after raw data is first collected by Flume and ingested 
-through a Kafka cluster, they are parsed, cleansed and partitioned by the
-Lookback Router; then processing logic such as windowing, grouping, joining, 
-fraud detection are applied by the Session Collector and the Fraud Collector, 
-RocksDB is used as the local store for intermediate states; finally the 
Uploader 
-uploads results to HDFS, ElasticSearch, RedShift and Hive. 
+through a Kafka cluster, it is parsed, cleansed and re-partitioned by the
+_Lookback Router_; then processing logic such as windowing, grouping, joining and fraud detection is applied by the _Session Collector_ and the _Fraud Collector_.
+The pipeline uses Samza's RocksDB store to perform stateful aggregations; finally, the _Uploader_ writes results to ElasticSearch, RedShift and Hive.
 
 The new solution achieved significant improvements:
 
@@ -66,6 +65,4 @@ Key Samza features: *Stateful processing*, *Windowing*, 
*Kafka-integration*
 
 More information
 
--   
[https://www.youtube.com/watch?v=KQ5OnL2hMBY](https://www.youtube.com/watch?v=KQ5OnL2hMBY)
--   [https://www.tripadvisor.com/](https://www.tripadvisor.com/)
-    
\ No newline at end of file
+-   [Converting Hadoop M/R ETL to use Stream Processing at 
TripAdvisor](https://www.youtube.com/watch?v=KQ5OnL2hMBY)
\ No newline at end of file
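The windowing and grouping the _Session Collector_ performs can be sketched as a simple sessionizer. This is an illustrative Python sketch, not TripAdvisor's code; the `ts` field (unix seconds) and the 30-minute inactivity gap are assumptions:

```python
def sessionize(events, gap_seconds=1800):
    """Group one user's events into sessions, splitting on inactivity gaps.

    Events are sorted by timestamp; whenever the gap between consecutive
    events exceeds `gap_seconds`, the current session is closed and a new
    one begins. Each returned session is a list of events.
    """
    sessions = []
    current = []
    last_ts = None
    for event in sorted(events, key=lambda e: e["ts"]):
        if last_ts is not None and event["ts"] - last_ts > gap_seconds:
            sessions.append(current)  # inactivity gap: close the session
            current = []
        current.append(event)
        last_ts = event["ts"]
    if current:
        sessions.append(current)
    return sessions
```

In the streaming pipeline this runs per partitioned user key with the open session held in RocksDB, which is what removes the hours-long batch lag.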

http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/_powered-by/linkedin.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/linkedin.md b/docs/_powered-by/linkedin.md
index 8b5f6ca..0bd53d6 100644
--- a/docs/_powered-by/linkedin.md
+++ b/docs/_powered-by/linkedin.md
@@ -1,7 +1,7 @@
 ---
 name: LinkedIn
 domain: linkedin.com
 priority: 0
 ---
 <!--
    Licensed to the Apache Software Foundation (ASF) under one or more

http://git-wip-us.apache.org/repos/asf/samza/blob/7d3eb08b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/core-concepts/core-concepts.md 
b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
index c4e5c21..c1724fb 100644
--- a/docs/learn/documentation/versioned/core-concepts/core-concepts.md
+++ b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
@@ -1,4 +1,4 @@
 ---
 layout: page
 title: Core concepts
 ---
@@ -43,7 +43,7 @@ _**Unified API:**_ Use a simple API to describe your 
application-logic in a mann
 
 *Massive scale:* Battle-tested on applications that use several terabytes of 
state and run on thousands of cores. It [powers](/powered-by/) multiple large 
companies including LinkedIn, Uber, TripAdvisor, Slack etc. 
 
-Next, we will introduce Samza’s terminology. You will realize that it is 
extremely easy to get started with [building](/quickstart/{{site.version}}) 
your first stream-processing application. 
+Next, we will introduce Samza’s terminology. You will realize that it is 
extremely easy to [get started](/quickstart/{{site.version}}) with building 
your first application. 
 
 
 ## Streams, Partitions
