Google has announced that it is open sourcing its Cloud Dataflow SDK for Java, a move it says will make it easier for developers to integrate with its managed service and will form the basis for porting Cloud Dataflow to other languages and execution environments.
Google first unveiled Cloud Dataflow back in June.
“We created Cloud Dataflow, which is now currently an alpha release, as a platform to democratize large scale data processing by enabling easier and more scalable access to data for data scientists, data analysts and data-centric developers,” says Google software engineer Sam McVeety. “Regardless of role or goal – users can discover meaningful results from their data via simple and intuitive programming concepts, without the extra noise from managing distributed systems.”
“We’ve learned a lot about how to turn data into intelligence as the original FlumeJava programming models (basis for Cloud Dataflow) have continued to evolve internally at Google,” McVeety says.
Google says it is open sourcing the SDK so that developers can “spur innovation in combining stream and batch based processing models,” adapt the Dataflow programming model to other languages, and execute Dataflow on other service environments.