AWS Data Ecosystem Architecture

We put organizational change management in place to gain employee acceptance and support and ensure ultimate success of new IT initiatives.

SERVICES AND TECHNOLOGIES

  • Modernization of customer analytics data environment
  • Impact Makers scrum master leading an integrated Marriott and Impact Makers architecture team
  • Impact Makers designing the architecture and setting standards for an S3-based data lake, including data ingestion, refinement, and governance
  • Architecture supports streaming and batch data processing pipelines, as well as end-user analysis and reporting capabilities
  • AWS for storage, with a cloud-provider-agnostic architecture as an imperative

CLIENT’S CHALLENGE

  • Absence of Enterprise Data Governance: Organization was implementing a new cloud data platform and recognized the opportunity to implement a business-focused initiative for governing data
  • Compliance with Global Data Privacy Regulations: Needed capability to quickly identify where customer data was located within their ecosystem and how it was being used. No central data catalog existed
  • Key person dependencies: Key individuals within the organization held most of the knowledge about data
  • No capability to consistently measure and expose data quality: Quality issues were addressed individually in a reactive, siloed manner rather than through long-term fixes for the entire organization
  • Multiple paths for managing reference data: This $20B hospitality company was implementing a new cloud-based Modern Data Platform and lacked a solution to enable self-managed, consolidated reference data
  • Existing process manual and time-consuming: The existing reference data process was Excel spreadsheet-based. It included a two-week cycle from submission of changes to reference data to the publishing of that data into the production environment
  • Access to reference datasets: Managing reference datasets within the new platform with a two-week delay would significantly impact the desired agility and benefits provided by modernization

IMPACT MAKERS’ APPROACH

  • Established operational data governance framework to manage data in the Cloud Data Platform
  • Developed architecture and design for implementation of a platform data catalog
  • Designed data quality measurement and remediation solution for the platform
  • Developed prioritized program plan for end-to-end implementation which included multiple iterations to apply incremental learnings
  • Gathered requirements from business data stewards and IT technical staff
  • Architected solution to satisfy the end-user requirement for a solid, reliable solution that was less dependent on IT
  • Evaluated several tool alternatives before deciding to leverage an existing relationship with the client’s MDM vendor
  • Leveraged Amazon Web Services and designed a series of serverless solutions to move data between the data lake and the reference data management (RDM) tool
  • Included a bulk load service capability to load large datasets into the RDM tool instead of manually keying them in
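The case study does not name the specific serverless services or the RDM tool's interface, so the following is only a minimal sketch of how the lake-to-RDM bulk load might look, assuming an AWS Lambda function triggered by an S3 event notification and reference data stored as CSV. The names make_handler, read_object, bulk_load, and BATCH_SIZE are hypothetical; read_object and bulk_load stand in for an S3 read and a call to the RDM vendor's bulk API, neither of which is described in the source.

```python
from __future__ import annotations

import csv
import io
from typing import Callable, Iterator
from urllib.parse import unquote_plus

BATCH_SIZE = 500  # hypothetical batch size; tune to the RDM tool's limits


def s3_objects_from_event(event: dict) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        objects.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return objects


def batch_reference_rows(csv_text: str, batch_size: int = BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield lists of CSV rows (as dicts), each at most batch_size long."""
    batch = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def make_handler(
    read_object: Callable[[str, str], str],
    bulk_load: Callable[[list[dict]], None],
):
    """Build a Lambda-style handler around injected I/O functions.

    read_object(bucket, key) returns the object's text (assumed helper).
    bulk_load(rows) sends one batch to the RDM tool (assumed helper).
    """

    def handler(event, context):
        loaded = 0
        for bucket, key in s3_objects_from_event(event):
            csv_text = read_object(bucket, key)
            for batch in batch_reference_rows(csv_text):
                bulk_load(batch)
                loaded += len(batch)
        return {"rows_loaded": loaded}

    return handler
```

Injecting read_object and bulk_load keeps the sketch independent of any particular SDK or vendor API, so the batching logic can be exercised locally before wiring in real storage and RDM calls.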

IMPACT & RESULTS

  • Extensible policies and processes facilitated a streamlined expansion of the initial platform and enabled efficient ingestion and access to data
  • Identified, documented, and staffed key roles required for implementing data governance of the platform
  • Business process analysis and identification of fundamental governance needs aided in the selection of tool suite
  • Central data catalog capability provided a single repository of data knowledge
  • Measuring and exposing data quality provided a foundation for remediation framework
  • Reduced time to manage reference data from days to minutes or hours, and the task is now performed solely by a business user before new reference data moves to production
  • Enabled client to react more quickly to market changes by reducing the amount of time it takes to approve new reference data
  • Using a data delivery pipeline in the solution provided a robust, IT-supported solution that complied with the client’s best practices