A Simple Apache Spark Demo

Apache Spark is a data processing framework that can quickly process very large data sets. It can also distribute processing tasks across multiple computers, either on its own or in tandem with other distributed computing tools.

About this example

In this post I am sharing a simple Apache Spark example project. The source code for this example is available here: https://github.com/jobinesh/apache-spark-examples.git

Here is a quick overview of the modules in this project:

  • spark-job-common: All the common classes that you need for building a Spark job live here. This approach may help you avoid boilerplate code in your Spark job implementation.
  • spark-job-impl: A classic word count Spark example is available here. This class may help you understand how the source is structured and how the common classes from the spark-job-common module are used.
  • spark-job-launcher: SparkLauncher helps you start Spark applications programmatically.

The Spark application in the spark-job-impl module reads the text from the src/main/resources/demo.txt file and generates an output file with the total count for each word. The output directory location is configured in src/main/resources/application.conf.
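
To make the word count logic concrete, here is a minimal sketch in plain Java. It is not the code from the repository and it does not use Spark at all, so it runs on its own. The class name WordCountSketch and the method countWords are hypothetical, and splitting on whitespace with case-sensitive matching is an assumption; the actual job may tokenize differently. In a Spark job, the same steps would typically be expressed with flatMap, mapToPair and reduceByKey on an RDD.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java sketch of the word count logic; it does not use Spark.
// WordCountSketch and countWords are hypothetical names, not taken from the repository.
class WordCountSketch {

    // Splits each line into whitespace-separated words and counts how often each word occurs.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                // Comparable to flatMap on an RDD: one line becomes many words.
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                // Blank lines produce an empty token, which is dropped here.
                .filter(word -> !word.isEmpty())
                // Comparable to mapToPair followed by reduceByKey: group equal words and count them.
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample input is made up for illustration; the real job reads demo.txt.
        List<String> lines = List.of("to be or not to be", "that is the question");
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

A TreeMap is used only so that the printed words come out in sorted order; a distributed Spark job would not guarantee any ordering unless it sorts explicitly.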

How to run this example?

The detailed steps are available here: 

Enjoy!
   


Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of my employer.