Apache Spark Programming with Databricks 101

What is Apache Spark?

Databricks defines Apache Spark as a lightning-fast unified analytics engine for big data and machine learning. Since its release, Spark has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Netflix, Yahoo, and eBay have deployed Spark at massive scale, collectively processing multiple petabytes of data on clusters of over 8,000 nodes. The project has quickly grown into the largest open-source community in big data, with over 1,000 contributors from more than 250 organizations.

What is the Apache Spark Programming with Databricks course all about?

This course uses a case study-driven approach to explore the fundamentals of Spark Programming with Databricks, including Spark architecture, the DataFrame API, query optimization, and Structured Streaming. First, you will become familiar with Databricks and Spark, recognize their major components, and explore datasets for the case study using the Databricks environment. After ingesting data from various file formats, you will process and analyze datasets by applying a variety of DataFrame transformations, Column expressions, and built-in functions. Lastly, you will execute streaming queries to process streaming data and highlight the advantages of using Delta Lake.

What is the duration of the course?

The course is two days long.

Course Objectives:

Upon completion of the course, students should be able to:

  • Define the major components of Spark architecture and execution hierarchy
  • Describe how DataFrames are built, transformed, and evaluated in Spark
  • Apply the DataFrame API to explore, preprocess, join, and ingest data in Spark
  • Apply the Structured Streaming API to perform analytics on streaming data
  • Navigate the Spark UI and describe how the Catalyst optimizer, partitioning, and caching affect Spark’s execution performance
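
To make the objective about how DataFrames are "built, transformed, and evaluated" more concrete, here is a minimal pure-Python sketch of lazy evaluation, the idea behind Spark's split between transformations and actions. The `LazyFrame` class, its methods, and the sample rows are hypothetical stand-ins invented for illustration; they are not the Spark DataFrame API, and real Spark also optimizes and distributes the recorded plan.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]


class LazyFrame:
    """Toy stand-in for a DataFrame: records steps and runs them only on an action."""

    def __init__(self, rows: Iterable[Row], plan: Optional[List[Callable]] = None):
        self._rows = list(rows)
        self._plan = list(plan or [])

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyFrame":
        # Transformation: nothing runs yet, we only append a step to the plan.
        step = lambda rows: (r for r in rows if predicate(r))
        return LazyFrame(self._rows, self._plan + [step])

    def select(self, *columns: str) -> "LazyFrame":
        # Transformation: keep only the named columns, again without running.
        step = lambda rows: ({c: r[c] for c in columns} for r in rows)
        return LazyFrame(self._rows, self._plan + [step])

    def collect(self) -> List[Row]:
        # Action: the recorded plan is executed only now.
        rows = iter(self._rows)
        for step in self._plan:
            rows = step(rows)
        return list(rows)


# Hypothetical sample data for illustration.
events = LazyFrame([
    {"user": "a", "revenue": 10.0},
    {"user": "b", "revenue": 0.0},
    {"user": "c", "revenue": 25.5},
])
paying = events.filter(lambda r: r["revenue"] > 0).select("user")  # no work yet
result = paying.collect()  # execution happens here
```

Building `paying` does no work on the rows; only `collect()` triggers execution, which is the behavior the course explores with real DataFrames.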

Target Audience:

  • Data engineer
  • Data scientist
  • Machine learning engineer
  • Data architect

Prerequisites:

  • Familiarity with basic SQL concepts (select, filter, group by, join, etc.)
  • Beginner programming experience with Python or Scala (syntax, conditions, loops, functions)

Additional Notes:

All participants will need:

  • An internet connection
  • A device with a supported internet browser

NOTE: The class will be delivered through GoToTraining, our chosen online platform. Each registrant will receive GoToTraining log-in instructions prior to attendance.

Course Outline:

Day 1: DataFrames

  • Introduction: Databricks Ecosystem, Spark Overview, Case Study
  • Databricks Platform: Databricks Concepts, Databricks Platform, Lab
  • Spark SQL: Spark SQL, DataFrames, SparkSession, Lab
  • Reader and Writer: Data Sources, DataFrameReader/Writer, Lab

Day 2: DataFrames and Transformations

  • DataFrame and Column: Columns and Expressions, Transformations, Actions, Rows, Lab
  • Aggregation: Groupby, Grouped Data Methods, Aggregate Functions, Math Functions, Lab
  • Datetimes: Dates and Timestamps, Datetime Patterns, Date Functions, Lab
  • Complex types: String Functions, Collection Functions
  • Additional Functions: Non-aggregate Functions, Na Functions, Lab

Day 3: Transformations and Spark Internals

  • UDFs: User-Defined Functions, Vectorized UDFs, Performance, Lab
  • Spark Architecture: Spark Cluster, Spark Execution, Shuffling
  • Query Optimization: Query Optimization, Catalyst Optimizer, Adaptive Query Execution
  • Partitioning: Partitions vs. Cores, Default Shuffle Partitions, Repartition, Lab
  • Review: Review of lab
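
The "Partitions vs. Cores" topic comes down to simple arithmetic: each partition becomes one task, and a cluster runs roughly one task per core at a time. The sketch below assumes that simplified model. The helper names are hypothetical, and 200 is the default value of `spark.sql.shuffle.partitions` in open-source Spark.

```python
import math

DEFAULT_SHUFFLE_PARTITIONS = 200  # default of spark.sql.shuffle.partitions


def task_waves(num_partitions: int, total_cores: int) -> int:
    """Sequential rounds of tasks needed if each core runs one task at a time."""
    if num_partitions < 0 or total_cores <= 0:
        raise ValueError("partitions must be >= 0 and cores must be > 0")
    return math.ceil(num_partitions / total_cores)


def idle_cores_in_last_wave(num_partitions: int, total_cores: int) -> int:
    """Cores left without a task during the final round."""
    remainder = num_partitions % total_cores
    return 0 if remainder == 0 else total_cores - remainder


# Hypothetical cluster sizes for illustration.
for cores in (8, 16, 64):
    waves = task_waves(DEFAULT_SHUFFLE_PARTITIONS, cores)
    idle = idle_cores_in_last_wave(DEFAULT_SHUFFLE_PARTITIONS, cores)
    print(f"{cores} cores: {waves} waves, {idle} idle cores in the last wave")
```

Under this model, a partition count that is a multiple of the core count keeps every core busy in the final round, which is one reason to repartition.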

Day 4: Structured Streaming and Delta

  • Streaming Query: Streaming Concepts, Streaming Query, Transformations, Monitoring, Lab
  • Processing Streams: Lab
  • Delta Lake: Delta Lake Concepts, Batch and Streaming
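
The streaming topics rest on one idea: a streaming query processes new data in small batches and carries running state between them. The sketch below is a pure-Python toy model of that idea, not the Structured Streaming API. `RunningCount` and the sample event batches are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


class RunningCount:
    """Toy stateful aggregation: counts events per key across micro-batches."""

    def __init__(self) -> None:
        self._state: Counter = Counter()

    def process_batch(self, batch: Iterable[str]) -> Dict[str, int]:
        # Only the new batch is read; earlier results live in the state.
        self._state.update(batch)
        return dict(self._state)


# Hypothetical micro-batches of event types arriving over time.
query = RunningCount()
batches: List[List[str]] = [["click", "view"], ["click"], ["purchase", "click"]]
snapshots = [query.process_batch(b) for b in batches]
```

Each snapshot reflects all data seen so far, even though every call reads only the newest batch.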

Conclusion:

Are you looking to learn the mechanics of an analytics platform that accelerates innovation by unifying data science, engineering, and business? Then look no further. The Apache Spark Programming with Databricks training course will shed light on the basics of creating Spark jobs, loading data, and working with data.

To enroll, contact P2L today!

Microsoft Excel Intermediate

Microsoft Excel: Intermediate

Microsoft Excel plays a pivotal role in many sectors. It is the most widely used spreadsheet program in business, classwork, and even personal data organization, and it is an essential tool for formula-based arithmetic and other mathematical calculations. Because of its usefulness and its ability to serve as a visual basis for a variety of applications, many businesses, individuals, and institutions have adopted it.

Why is Microsoft Excel Important?

Here are some of the key ways MS Excel is used daily.

  1. Creating graphs

Excel can produce a variety of graphs, which different departments use to represent statistical data visually. Its built-in formulas and procedures make it easy to create charts. Considering how many roles it fills, Excel is also an affordable alternative to dedicated graphing programs.

  2. Programming

Excel supports macro creation through Visual Basic for Applications (VBA), which lets users automate repetitive tasks and build custom functions. As a result, complex calculations can be handled easily, increasing efficiency.

  3. Organizing important data

Raw, unprocessed data must be organized and stored well before it is useful. With Excel, users can organize their data in tables that are easy to keep up to date. This benefit is felt most by administrators who regularly update large volumes of data. Using Excel tables, administrators can track the progress of individual or combined statistics, such as report trends and product complexity.

If this sounds like a career path that you would be interested in, P2L has the perfect course for you: the Intermediate Microsoft Excel Course.

Course: Intermediate Microsoft Excel 2019

This training class is designed for students seeking to expand their skill set by learning to use advanced formulas, lists, and illustrations. Additionally, students will work with charts and advanced formatting, including styles.

Profile of the audience

Students who have basic skills in Microsoft Excel 2019 and want to learn more advanced skills, or students who want to learn these topics in the 2019 interface, will benefit from this course.


Skills gained

  • Make use of formulas and functions.
  • Modify charts and create new ones.
  • Manage, convert, sort, and filter lists.
  • Edit illustrations in a worksheet.
  • Use tables to organize information.

Prerequisites

Students must have the following skills before attending this course:

  • Knowledge of Excel at a basic level


Last but not least, knowledge of Microsoft Excel is vital for proficiency in most modern businesses. Most organizations record their products, programs, and activities in a systematic and up-to-date way. Therefore, Excel macro developers are regarded as valuable resources in an organization.

Don’t wait any longer and contact P2L to enroll today!

Veeam Backup & Replication

Veeam Backup & Replication: Backing Up Data Made Easier

What is Veeam® Backup & Replication?

The Veeam Help Center defines Veeam Backup & Replication as a comprehensive data protection and disaster recovery solution. With Veeam Backup & Replication, you can create image-level backups of virtual, physical, and cloud machines and restore from them. The technology used in the product optimizes data transfer and resource consumption, which helps to minimize storage costs and recovery time in case of a disaster.

Veeam Backup & Replication provides a centralized console for administering backup/restore/replication operations in all supported platforms (virtual, physical, cloud). Also, the console allows you to automate and schedule routine data protection operations and integrate with solutions for alerting and generating compliance reports.

What are Veeam® Backup & Replication’s main features?

As per the Veeam help center, the main functionality of Veeam Backup & Replication includes:

  • Backup: creating image-level backups of virtual, physical, and cloud machines, as well as backups of NAS share files.
  • Restore: performing a restore from backup files to the original or a new location. Veeam Backup & Replication offers several recovery options for various disaster recovery scenarios, including instant VM recovery, image-level restore, file-level restore, restore of application items, and so on.
  • Replication: creating an exact copy of a VM and maintaining the copy in sync with the original VM.
  • Continuous Data Protection (CDP): replication technology that helps you protect mission-critical VMs and achieve recovery point objectives (RPOs) measured in seconds.
  • Backup Copy: copying backup files to a secondary repository.
  • Storage Systems Support: backing up and restoring VMs using capabilities of native snapshots created on storage systems.
  • Tape Devices Support: storing copies of backups in tape devices.
  • Recovery Verification: testing VM backups and replicas before recovery.

What is the Veeam® Backup & Replication™ v11: Architecture and Design course about?

The two-day Veeam® Backup & Replication™ v11: Architecture and Design training course teaches IT professionals how to architect a Veeam solution effectively by following the Veeam Architecture Methodology used by Veeam’s own Solution Architects. During the two days, attendees will explore requirement gathering and infrastructure assessment goals and use that information to design Veeam solutions in team exercises. Attendees will analyze the considerations involved in turning logical designs into physical designs and describe their obligations to the team that will implement the design. Other topics include the security, governance, and validation impacts of architecting a Veeam solution and how to build these into the overall design. Attendees should expect to contribute to team exercises, present designs, and defend their decisions.

Certification:

Completion of this course satisfies the prerequisite for taking the Veeam Certified Architect (VMCA) exam, the highest level of Veeam certification. VMCA certification proves knowledge of architecture and design concepts, highlighting the level of skill required to efficiently architect a Veeam solution in a range of real-world environments.

Target Audience:

Senior Engineers and Architects responsible for creating architectures for Veeam environments.

Prerequisites:

Attendees should ideally be VMCE certified and have extensive commercial experience with Veeam, along with broad technical knowledge of servers, storage, networks, virtualization, and cloud environments.

Objectives:

After completing this course attendees should be able to:

  • Design and architect a Veeam solution in a real-world environment
  • Describe best practices, review an existing infrastructure, and assess business/project requirements
  • Identify relevant infrastructure metrics and perform component (storage, CPU, memory) quantity sizing
  • Provide implementation and testing guidelines in line with designs
  • Innovatively address design challenges and pain points, matching appropriate Veeam Backup & Replication features with requirements

Course outline:

  • Introduction
  • Discovery
  • Conceptual design
  • Logical design
  • Physical/tangible design
  • Implementation and Governance
  • Validation and Iteration

Conclusion:

If you’re looking for a comprehensive course that helps you design and architect a Veeam solution in a real-world environment, then this is the ideal course for you.

To enroll, contact P2L today!

The Mirantis Cloud Course You Need To Succeed

What Is the Mirantis Cloud Platform?

The Mirantis Cloud-Native Platform provides a holistic cloud experience for complete app and DevOps portability, a single pane of glass, and fully automated full-stack lifecycle management with continuous updates.

What is Cloud Native Computing?

Cloud-native computing entails the use of cloud computing software for building and running scalable applications in dynamic, changing environments such as public clouds, private clouds, and hybrid clouds.

The platform embraces modern approaches, such as serverless and microservices. It lets teams write, build, and deploy applications quickly without compromising quality or security.

If this sounds like something you would be interested in, then you’ve come to the right place! P2L is proud to announce that it will be offering one of the top Mirantis Cloud-Native courses to meet all your cloud-related needs. 

Mirantis – CN252: Cloud-Native Development Bootcamp (On-Demand)


This course will allow you to learn the core skills you need to develop high-performance, secure containerized applications and orchestrate them on Kubernetes, as well as advanced techniques for streamlining the container development process, instrumenting containers for production systems, and building containerized continuous integration pipelines. By accelerating the containerization process for developers and DevOps teams, this bundle allows them to fully utilize all containerization has to offer.



Who Can Benefit

This course is a good fit for participants with the following profile:

  • Motivation: learning containerization and Kubernetes quickly before developing containerized applications and continuous integration pipelines
  • Roles: developers, application architects, and DevOps engineers

Prerequisites

The following skills will help students succeed in this course:

  • Knowledge of the bash shell
  • Filesystem navigation and manipulation
  • Editing text on the command line with vim or nano
  • Common tools such as curl and ping


If you are eager to learn more or you are planning on enrolling in the course, contact P2L today!

Veeam Availability Suite™ v11

Manage Data with Veeam Availability Suite™ v11

What is Veeam®?

As per Global Security Mag, Veeam delivers Backup as a Service (BaaS) and Disaster Recovery as a Service (DRaaS) to the market through partnerships with leading cloud and managed service providers in over 180 countries. To ensure these services are seamlessly integrated into v11, the new Veeam Service Provider Console v5 offers service providers a web-based platform for centralized management, monitoring, and customer self-service access to data protection operations. Version 5 features expanded backup management for Linux and Mac, monitoring and reporting for cloud-native AWS and Azure backups, enhanced security with multi-factor authentication (MFA), and powerful insider protection services.

What is Veeam Availability Suite™ v11?

As per Global Security Mag, the new Veeam Availability Suite™ v11 combines the expansive backup and recovery features of Veeam Backup & Replication v11 with the monitoring, reporting, and analytics capabilities of Veeam ONE™ v11. It offers businesses complete data protection and visibility, enabling customers to achieve unparalleled data availability and governance across multi-cloud environments. Furthermore, adding Veeam DR Pack, which includes Veeam Disaster Recovery Orchestrator (formerly Veeam Availability Orchestrator), to a new or previous purchase of either Veeam Availability Suite or Veeam Backup & Replication provides site recovery automation and DR testing to ensure business continuity.

What is the Veeam Availability Suite™ v11 course all about?

The Veeam® Availability Suite™ v11: Configuration and Management training course is a technical deep-dive focused on teaching IT professionals the skills to configure, manage and support a Veeam Availability Suite v11 solution. With extensive hands-on labs, the class enables administrators and engineers to effectively manage data in an ever-changing technical and business environment, bringing tangible benefits to businesses in the digital world.

What is the duration of the Veeam Availability Suite™ v11 course?

The course is three days long.

Skills Gained:

After completing this course, attendees should be able to:

  • Describe Veeam Availability Suite components, their usage scenarios, and their relevance to your environment.
  • Effectively manage data availability in on-site, off-site, cloud, and hybrid environments.
  • Ensure both Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) are met.
  • Configure Veeam Availability Suite to ensure data is protected effectively.
  • Adapt to an organization’s evolving technology and business data protection needs.
  • Ensure recovery is possible, effective, efficient, secure, and compliant with business requirements.
  • Provide visibility of the business data assets, reports, and dashboards to monitor performance and risks.
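
The RTO and RPO objective above can be read as two simple checks: the largest gap between consecutive restore points bounds potential data loss (RPO), and the measured restore duration bounds downtime (RTO). The sketch below assumes that simplified reading. The function names, timestamps, and targets are hypothetical and do not come from any Veeam API.

```python
from datetime import datetime, timedelta
from typing import List


def worst_case_data_loss(restore_points: List[datetime], now: datetime) -> timedelta:
    """Largest gap between consecutive restore points, including the gap up to now."""
    if not restore_points:
        raise ValueError("at least one restore point is required")
    points = sorted(restore_points) + [now]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def meets_objectives(
    restore_points: List[datetime],
    now: datetime,
    restore_duration: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> bool:
    """True when both the data-loss window and the restore time fit the targets."""
    return worst_case_data_loss(restore_points, now) <= rpo and restore_duration <= rto


# Hypothetical schedule: a restore point every six hours, checked at 15:00.
points = [datetime(2024, 1, 1, hour) for hour in (0, 6, 12)]
now = datetime(2024, 1, 1, 15)
ok = meets_objectives(
    points, now,
    restore_duration=timedelta(minutes=45),
    rpo=timedelta(hours=8),
    rto=timedelta(hours=1),
)
```

With these sample values the worst gap is six hours, so an eight-hour RPO is met, while a four-hour RPO would not be.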

Target audience:

This course is suitable for anyone responsible for configuring, managing, or supporting a Veeam Availability Suite v11 environment.

Prerequisites:

Students should be experienced professionals with solid knowledge of servers, storage, networking, and virtualization.

  • Recommended: Veeam Availability Suite

Course Details:

  • Introduction
  • Building backup capabilities
  • Building replication capabilities
  • Secondary backups
  • Advanced repository capabilities
  • Protecting data in the cloud
  • Restoring from backup
  • Recovery from replica
  • Testing backup and replication
  • Veeam Backup Enterprise Manager and Veeam ONE
  • Configuration backup

Conclusion:

With most organizations adopting multi-cloud ecosystems and workers increasingly operating remotely, it has become harder than ever to manage and control data. To ease the process, formulate a successful backup strategy, and create, modify, optimize, and delete backup jobs, opt for the newly updated Veeam Availability Suite™ v11.

If you’re looking for a course that helps you understand the functions of Veeam Availability Suite™ v11, then this is the perfect one for you.

To enroll, contact P2L today!