DATAllegro solves Telecom Data Warehouse Challenges
Business Challenges
Telecommunication companies face significant business challenges as they expand services. Integration of voice, VOIP, Internet, IPTV (Internet Protocol TV) and mobile services increases data volumes and broadens the need for analytics to predict subscriber usage. Call Detail Records (CDR) often range between 500 million and 2 billion rows a day just for voice data. Data retention is increasing from the typical 3-4 months of data to two years to meet EU and other regulations.
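To make the retention figures concrete, the following is a rough sizing sketch that uses only the row counts and retention periods quoted above. The 30-day month is a simplifying assumption, and the function name is illustrative.

```python
# Rough sizing of retained voice CDR rows, using only the figures quoted above.
ROWS_PER_DAY_LOW = 500_000_000      # 500 million rows/day
ROWS_PER_DAY_HIGH = 2_000_000_000   # 2 billion rows/day
DAYS_PER_MONTH = 30                 # assumption: 30-day months


def retained_rows(rows_per_day: int, retention_days: int) -> int:
    """Rows held in the warehouse once the retention window is full."""
    return rows_per_day * retention_days


four_months = 4 * DAYS_PER_MONTH    # 120 days
two_years = 24 * DAYS_PER_MONTH     # 720 days

for label, days in (("4 months", four_months), ("2 years", two_years)):
    low = retained_rows(ROWS_PER_DAY_LOW, days)
    high = retained_rows(ROWS_PER_DAY_HIGH, days)
    print(f"{label}: {low:.2e} to {high:.2e} rows")
```

Under these assumptions, moving from four months to two years of retention multiplies the retained row count by six, to roughly 0.36 to 1.44 trillion voice rows.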
As always, the data warehouse must support analysis of CDR records to reduce customer churn in the face of strong competition. Customer retention reduces the investment required to win new subscribers and increases network efficiency. Churn increases exposure to fraud and bad debt and reduces investor confidence. Churn analysis is growing more complex as business and residential subscribers split their business between several network operators.
Security continues to concern management, given that the data warehouse is often one of the largest sources of sensitive customer information in the company.
Pressure to meet these business challenges results in technical data warehouse challenges. Solutions must be flexible enough to manage demands for new data sources and new application requirements. Mediation of CDR to produce billing aggregates requires high performance on large volumes. The data warehouse infrastructure must provide high-speed backup within operational windows and high-speed loading for high volumes. To meet the challenge of round-the-clock processing, the solution should provide high availability and near real-time loads.
Traditional data warehouse technologies are often unable to provide reasonable response times in handling expanding CDR volume. Managing these solutions is often complex and expensive. First-generation data warehouse appliances, while offering more cost-effective solutions than their traditional counterparts, do not provide the sophistication needed to deal with the complexities of CDR mediation. A second generation of data warehouse appliances offers a compelling set of benefits to deal with the challenges of telecom data warehousing and enables telecom carriers to build data warehouse solutions that will grow as the business grows.
The Solution
The DATAllegro v3 data warehouse appliance is a second-generation data warehouse appliance built on a platform of standards-based, enterprise-class devices. It provides solutions for both CDR mediation and large-volume data warehousing.
Customers use DATAllegro v3 to load five billion or more CDR records a day into a DATAllegro data warehouse appliance for CDR mediation and augmentation using ETL tools. High-speed parallel loading allows updates to the DATAllegro enterprise data warehouse containing several hundred terabytes of customer data. In the data warehouse, high performance aggregation meets the needs of margin analysis, fraud detection and billing dispute management.
Breakthrough performance for mediation and analytics
The DATAllegro v3 data warehouse appliance provides a reliable high performance solution for CDR mediation and complex analytics on large data volumes. Several technologies provide performance that is ten to one hundred times faster than traditional solutions:
- Massive parallel processing - DATAllegro Redundant Array of Inexpensive Data warehouses (RAIDW)™ software coupled with standard servers from Dell®, storage units from EMC® and open source Ingres offers high parallel query performance.
- Direct Data Streaming - DATAllegro created Direct Data Streaming (DDS)™ using software, multi-level partitioning and hardware features to stream data off disk sequentially at high speeds.
- Ultra Shared Nothing - DATAllegro invented the Ultra Shared Nothing (USN)™ architecture to minimize data movement and support high-performance aggregation. Locally resolved joins and dynamic shuffle technology ensure high-speed CDR mediation processing.
- CPU Throughput - Intel® Quad Core Xeon™ processors stream data and process queries in parallel, ensuring the architecture is not bound by I/O and can advance as processor technology advances.
- Compression - I/O throughput increases from 800MBps to over 1.2GBps with compression. As a result, table scan speeds range from 0.5TB/minute to 10.5TB/minute depending on the number of data racks in the appliance.
- I/O separation - DATAllegro v3 separates workspace and user data space, ensuring consistent performance for aggregation. Workspace usage is usually random, while user data space is accessed sequentially in DATAllegro’s appliance. Workspace is stored locally on the Dell compute nodes used to process queries, while user data is stored on EMC Cx series storage nodes for reliability. The physical separation improves performance by reducing disk head movement.
- High-Speed Network Fabric - The nodes within a DATAllegro v3 appliance are connected through a dual 10Gbps (per port) InfiniBand interconnect. The RDMA (Remote Direct Memory Access) protocol in use reduces the overhead on processors for large data transfers. As a result, the interconnect has over twenty times the bandwidth of GigE. The high-speed fabric allows multiple appliances to exist on the same network, so updates between the CDR mediation appliance and the enterprise data warehouse occur quickly.
- Standards and Enterprise Class technologies - As storage and CPU technology advances, overall performance of the appliance will increase.
The result is a solution for CDR that enables business intelligence and analytics with high query performance and unprecedented reliability.
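The idea behind locally resolved aggregation in a shared-nothing design can be sketched generically. The example below is not DATAllegro's USN implementation; it is a minimal illustration that assumes rows are distributed by a simple modulo hash on a hypothetical subscriber ID, so each node can aggregate its own rows without moving data to other nodes. The node count, function names and sample rows are all invented for illustration.

```python
from collections import defaultdict

NUM_NODES = 4  # hypothetical node count


def node_for(subscriber_id: int) -> int:
    """Pick the owning node for a subscriber (simple modulo hash)."""
    return subscriber_id % NUM_NODES


def distribute(cdrs):
    """Route each (subscriber_id, seconds) row to the node that owns it."""
    nodes = [[] for _ in range(NUM_NODES)]
    for subscriber_id, seconds in cdrs:
        nodes[node_for(subscriber_id)].append((subscriber_id, seconds))
    return nodes


def local_aggregate(rows):
    """Sum call seconds per subscriber using only this node's rows."""
    totals = defaultdict(int)
    for subscriber_id, seconds in rows:
        totals[subscriber_id] += seconds
    return dict(totals)


def aggregate(cdrs):
    """Combine per-node results; a subscriber's rows live on one node only."""
    result = {}
    for rows in distribute(cdrs):
        result.update(local_aggregate(rows))
    return result


# Invented sample rows: (subscriber_id, call duration in seconds)
sample = [(101, 60), (102, 30), (101, 45), (205, 10), (102, 5)]
totals = aggregate(sample)
print(totals)
```

Because every row for a given subscriber lands on the same node, the per-node results never share keys, and the final step is a plain merge with no cross-node summation.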
Enterprise Level Reliability
Major technology partners reduce the risk of managing valuable data warehouse assets. Standards and enterprise-class devices within the appliance combine to ensure there is no single point of failure. Storage nodes interface to both primary and warm standby compute nodes through high-speed interconnects. EMC storage nodes are 100% reliable, providing both mirroring and hot standby disks. The appliance configuration includes redundant networks, redundant power domains and spare disks. The result is an appliance that offers enterprise-level redundancy and high MTBF.
DATAllegro is the first data warehouse appliance that provides active/active load balancing with appliances in different physical locations.
Scalability and Flexibility to support multiple solutions
DATAllegro v3 offers a flexible architecture at a low price point designed to scale as needed. The multi-rack appliance (MRA) allows multiple data storage racks to combine with a single control rack to create appliances that can scale from 15TB to a petabyte. Storage can be released on demand within an appliance for 15K/TB. Data racks can be added to the appliance in 15TB and 25TB increments to meet high volume requirements. A landing zone for high-speed loading and a flexible backup architecture enhance the ability to scale to high volumes. DATAllegro v3 offers price/performance combinations to support enterprise data warehouses where both detail and summary records are stored and high performance is provided for extensive analytics and regulatory requirements. Low-cost solutions support a pre-aggregation or mediation engine for CDR. DATAllegro v3 uses workload management and system balancing to support mixed data warehouse workloads with a combination of ad hoc reporting, parallel aggregation, concurrent loading and enterprise reporting.
Reduced Administration
Database administration is simpler than with traditional solutions, thanks to automated space management, reduced tuning and utilities for high-speed loading and backup. Second-generation appliances such as DATAllegro v3 offer utilities for query tuning and optimization for concurrency, workload and throughput requirements. High performance for near real-time loads alongside both short and long running queries ensures support for complex workloads.
Security
DATAllegro is the first data warehouse appliance to offer encryption. DATAllegro encrypts data at rest. This hardware-based encryption ensures minimal impact to performance or space utilization. Encryption is transparent to business intelligence and ETL tools accessing the appliance.
A comprehensive and sophisticated Infrastructure
CDR data warehouses require high-speed loading and aggregation to manage high data volumes and frequent refresh cycles. DATAllegro v3 provides high-speed loading using parallel processing across all nodes to ensure load speeds of 1TB/hour or more are attainable, while minimizing the impact on overall performance.
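As a back-of-the-envelope illustration of what a 1TB/hour load rate means for a daily load window, the sketch below converts daily row counts into load hours. The average row size of 200 bytes is a hypothetical figure chosen only for illustration; it does not come from this document, and real CDR row sizes vary by operator and schema.

```python
BYTES_PER_TB = 10**12          # decimal terabyte
ASSUMED_ROW_BYTES = 200        # hypothetical average CDR row size, not from the source


def load_hours(rows_per_day: int,
               row_bytes: int = ASSUMED_ROW_BYTES,
               tb_per_hour: float = 1.0) -> float:
    """Hours needed to load one day's rows at a given load rate."""
    terabytes = rows_per_day * row_bytes / BYTES_PER_TB
    return terabytes / tb_per_hour


# Daily volumes mentioned in this document: 500 million, 2 billion, 5 billion rows
for rows in (500_000_000, 2_000_000_000, 5_000_000_000):
    print(f"{rows:,} rows/day -> {load_hours(rows):.2f} h")
```

With the assumed row size, five billion rows a day amounts to about 1TB, or roughly one hour of loading at the quoted rate. A different row size scales the result proportionally.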
DATAllegro provides a platform that supports both analytical and operational data warehouses for CDR data. DATAllegro is uniquely able to support a mixed workload of both predefined and ad hoc queries. Automated workload management and system resource balancing support a workload of both simple and complex queries.
New Solutions for CDR
DATAllegro v3 is able to connect several appliances on the InfiniBand network so specific workloads can be targeted on each appliance and high-performance data transfer provides near real-time data on each. The “divide and conquer” approach provides flexibility to architect new solutions for CDR analysis. DATAllegro is able to transform CDR records into business intelligence while improving service level agreements with high query performance. Reduced administration and data warehousing costs alongside this performance ensure that as CDR analysis needs explode and traditional technologies cannot keep up, DATAllegro provides a solution that will fit both today’s and tomorrow’s telecom data warehousing volumes.